ADVANCED STATISTICAL METHODS FOR ENGINEERS
Chapter Zero Welcome to Advanced Statistical Methods for Engineers!
Ground rules – please…
• Use name tents
• Cell phones:
– Turn off or use vibrate
– Take phone calls outside
• Keep side conversations to a minimum
• Be prompt in returning from breaks
• Don’t do other work during class
• Let instructor know if you need to leave for more than 30 minutes
• Listen with an open and active mind…
• If you have a question at any time, ask!
– Other Ground Rules wanted by students?
– Class agree to these Ground Rules?
Agenda
Time axis: 8:00 to 5:00; breaks as needed
Day 1: Ch 0: Welcome | Ch 1: ANOVA and Equivalence Testing | Lunch on your own | Ch 2: Measurement Systems Analysis | End of Day Review
Day 2: Ch 3: Distribution Analysis | Lunch on your own | Ch 4: Process Capability and Tolerance Intervals | End of Day Review
Day 3: Ch 5: Regression and GLM | Lunch on your own | Ch 5: Regression and GLM continued | End of Day Review
Day 4: Ch 6: Logistic Regression | Ch 7: Statistical Resources | End of Day Review | Online Evaluations | Lunch on your own
Logistics
• Starting Time: 8:00
• Ending Time: Not later than 5:00
• Lunch 12:00-1:00
• Breaks every 90-120 minutes
• Power Outlets
• Rest Room Location
• Food and drink locations (snacks, cafeteria, etc.)
You Need ...
– Laptop with MINITAB and a working wireless Internet connection
– Writing instruments
– Access to data files
Icebreaker (5 Minutes) In my journey through the world of statistics…
One thing that has worked well for me is …
One thing that has been a challenge for me is …
(Extra Credit)
My favorite statistician, living or dead, is …
My favorite statistics joke is …
Expectations
– Tools, tools, tools…
• Course may overlap with material from DRM or Lean Sigma
• Tools may be familiar, but the intent is to present the tools with a focus on statistical thinking and decision-making.
• Topics may be explored in greater mathematical depth than is offered in other curricula.
– Benefits
• A deep mathematical dive can actually help you better see the surface.
• Awareness of mathematical assumptions is a critical first step for growing in your statistical knowledge, but advanced practitioners need to know:
– Which assumptions are most critical?
– When is it appropriate to break the rules?
– What are the consequences of breaking the rules?
• Statistical sophistication allows for flexibility and creativity in problem solving.
Expectations
– Experience Chart
• Mark an X in the column that best describes your experience with each topic
Columns: None | A Little | Comfortable | Proficient | I could teach it
Topics: Equivalence Testing; Tolerance Intervals; ANOVA Signal Interpretation; Measurement Systems Analysis; Distribution Analysis; Process Capability; General Linear Models
– Your Expectations
• Create a list at your table
• Each table will report
• Spokesperson: skip items already mentioned
– Time: 10 Minutes
Your Feedback is Critical
• September 17-20 represents the first wave of Advanced SME at MDT
• Given that many of you already are leaders in the statistical or DRM worlds, your suggestions for course improvements are extremely important!
• At the end of each day, we will engage in a brief feedback session.
• At the end of the week, there will be an online survey for you to formally evaluate the course.
• If you wish to provide more detailed feedback, please send an email to the instructor team: Leroy Mattson, Karen Hulting, Jeremy Strief, Tom Keenan, Grant Short, Dayna Cruz
9 | MDT Confidential
What questions do you have?
Chapter 1: ANOVA and Equivalence Testing
Topics
• Quality Trainer Review
• ANOVA
– Assumptions
– Using Minitab Assistant vs Stat Menu
– Calculation Deep Dive
– Sample Size
– ANOVA Signals
• Equivalence Testing
Quality Trainer Review
Comparing Grouped Data: Variables Data Response
ANOVA: ASSUMPTIONS
One-way ANOVA: Testing for the significance of one factor
• The null hypothesis:
– H0: μ1 = μ2 = … = μk
– Meaning that the population (response) means are equal at each of the k levels of this factor, i.e. the factor is NOT significant.
• The alternative hypothesis:
– HA: at least two population means are unequal
– Meaning that the factor IS significant
• Perform the One-way ANOVA and reject the null hypothesis if the p-value is < alpha
– Usually alpha = 0.05 (or 0.10 or 0.01)
– A way to remember: “If p is low – the null must go”.
ANOVA: General Process Steps
• Select a model
• Plan sample size using relevant data or guesses
• (Optional) Simulate the data and try the analysis
• Collect real data
• Fit the model (perform ANOVA and get p-value)
• Examine the residuals
• Transform the response or update the model, if necessary
• State conclusion
Typical Assumptions for ANOVA Factors
• Factors (or “Inputs”)
– Each factor can be set to two or more distinct levels
– Factor levels can be measured adequately
– Factor levels are “fixed” rather than “random”
– For multiple factors, all combinations of all levels are represented (levels are “completely crossed”)
Typical Assumptions for ANOVA Responses
• Response data is “complete”, not censored
• Some software requires “balanced” data: the same sample size for each level of the input factor
• Assumptions on Residuals
– Residual = Response – Fitted Value
– Normally distributed
– Equal variance (assumption relaxed in Minitab Assistant)
– Independent (e.g. no time trend)
ANOVA CALCULATIONS DEEP DIVE: STAT MENU & MINITAB ASSISTANT
ANOVA Calculations
• See www.khanacademy.org
– ANOVA 1 – Calculating SST (7:39)
– ANOVA 2 – Calculating SSW and SSB (13:20)
– ANOVA 3 – Hypothesis Test and F Statistic (10:14)
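As a rough companion to the three videos, the sketch below computes SST, SSW, SSB and the F statistic for a one-way layout in plain Python. The three small groups are illustrative values chosen for easy arithmetic (an assumption, not data from Ch1DataFile.mtw), and the function name `one_way_anova` is hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SST, SSW, SSB, F) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Total sum of squares: every value against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Within-group sum of squares: every value against its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Between-group sum of squares: group means against the grand mean
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return sst, ssw, ssb, f_stat

# Illustrative data only (assumed for this sketch)
groups = [[3, 2, 1], [5, 3, 4], [5, 6, 7]]
sst, ssw, ssb, f_stat = one_way_anova(groups)
print(sst, ssw, ssb, f_stat)
```

The identity SST = SSW + SSB is a quick arithmetic check. Converting F to a p-value requires the F distribution with (k-1, N-k) degrees of freedom, which is left to Minitab here.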
Minitab Analysis of Khan Dataset
Can arrange either Stacked or Unstacked
Consider a PQ Dataset
• Three runs of n=10 units produced and tensile tested
• See Ch1DataFile.mtw
• Columns TipTensile1, TipTensile2, TipTensile3
Minitab Options
• Could use
– Stat -> ANOVA -> One way
– Stat -> ANOVA -> One way (Unstacked)
– Stat -> ANOVA -> General Linear Model
– Stat -> Regression -> General Regression
– Minitab Assistant
• Data arrangement
– Stacked (one column for X, one column for Y)
– Unstacked (Y values in columns for each X)
ANOVA using Minitab Statistics Menu
Stat Menu Outputs
S, R² and adjusted R² are measures of how well the model fits the data.
Judging model fit
• S is measured in the units of the response variable and represents the standard distance data values fall from the fitted values
– For a given study, the better the model predicts the response, the lower S is
• R² (R-Sq) describes the amount of variation in the observed response values that is explained by the predictor(s)
– R² never decreases when predictors are added.
– R² is most useful when comparing models of the same size
• Adjusted R² is a modified R² that has been adjusted for the number of terms in the model
– R² can be artificially high with unnecessary terms, while adjusted R² may get smaller when terms are added to the model
– Use adjusted R² to compare models with different numbers of predictors
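To make the three fit measures concrete, here is a minimal sketch for a one-way ANOVA model, where each fitted value is its group mean. The data and the function name `fit_summary` are assumptions for illustration; the formulas are the usual textbook definitions rather than anything quoted from the Minitab output.

```python
from math import sqrt
from statistics import mean

def fit_summary(groups):
    """S, R-Sq and adjusted R-Sq when the fitted value is the group mean."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    s = sqrt(ss_error / (n - k))              # in the units of the response
    r_sq = 1 - ss_error / ss_total            # share of variation explained
    # Adjusted R-Sq compares mean squares, so it penalizes extra terms
    adj_r_sq = 1 - (ss_error / (n - k)) / (ss_total / (n - 1))
    return s, r_sq, adj_r_sq

# Illustrative data only (assumed for this sketch)
s, r_sq, adj_r_sq = fit_summary([[3, 2, 1], [5, 3, 4], [5, 6, 7]])
print(s, r_sq, adj_r_sq)
```

Adjusted R² is never larger than R², and the gap widens as terms are added without a matching drop in the error sum of squares.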
Comparisons Output
ANOVA – Examining Residuals
1) Test for Normality: Normal Probability Plot is a straight line
2) Test for Equal Variances: Residuals vs. Fitted Values plot is evenly distributed around the 0 line
Using the Stacked arrangement, there would also be a 4th Residual plot – Time Order. This is a Test for Independence – looking for a pattern over time.
Residuals are strongly non-normal . . .
Possible Causes:
• Failure of Equal Variance Assumption
• Outliers
• Missing Important Factors in the Model
• Data is from Non-Normal Population
What to do?
• Check for Outliers
• Check if Equal Variance is satisfied
• Perform Normality Test
• If data is from a Non-Normal Population, consider using Non-Parametric Tests or transforming the Response variable
If Residuals differ Group to Group
Possible Causes:
• Non-Constant Variance
• Outliers
• Missing Important Factors in the Model
What to do?
• Test the equal variance assumption using Stat > ANOVA > Test for Equal Variances
• If the test indicates unequal variances, consider transforming the response variable
• Verify whether the outlier is a data entry error
• Add the missing factor to the model
If there is a time pattern in the data . . .
What to do?
• Prevent by Randomizing
• A time effect may be present
• Consider time series procedure
Common Transformations
Transformation | Comments
√y | Appropriate for Poisson-distributed data
Log(y) | Appropriate if the response is exponentially increasing
1/y | Appropriate when responses are close to zero
sin⁻¹(√y) | Called the Arcsine Square Root function. Appropriate when the response is a proportion between zero and one.
Another useful tool is Box-Cox Transformation
Minitab Box-Cox Procedure:
$$Y' = \begin{cases} Y^{\lambda}, & \text{when } \lambda \neq 0 \\ \log_e(Y), & \text{when } \lambda = 0 \end{cases}$$
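The piecewise rule above can be sketched as a small Python function. The name `box_cox` and the positive-data check are assumptions for illustration; choosing λ itself (the job of the Minitab Box-Cox plot) is not attempted here.

```python
from math import log

def box_cox(y, lam):
    """Transform one response value: y**lam, or ln(y) when lam is 0."""
    if y <= 0:
        # The log and fractional powers need a positive response
        raise ValueError("Box-Cox requires positive data")
    return log(y) if lam == 0 else y ** lam

# lam = 0.5 is the square root, lam = -1 the reciprocal, lam = 0 the natural log
print(box_cox(100.0, 0.5), box_cox(4.0, -1), box_cox(1.0, 0))
```

Several rows of the Common Transformations table are special cases of this single family, which is why a rounded λ is usually reported.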
Minitab Screenshots
Box-Cox Transformation in Minitab
Minitab > Stat > Control Charts > Box-Cox Transformation
[Figure: Box-Cox Plot of Data 1. StDev versus Lambda, with lower and upper confidence limits marked. Lambda (using 95.0% confidence): Estimate 0.03, Lower CL -0.30, Upper CL 0.38, Rounded Value 0.00.]
ANOVA using Minitab Assistant
http://www.minitab.com/support/documentation/Answers/Assistant%20White%20Papers/OneWayANOVA_MtbAsstMenuWhitePaper.pdf
Report Card
Diagnostic Report
Power Report
Summary Report
ANOVA - Exercise
• Use Ch1DataFile.mtw
• Test for differences between the group means using both Stat menu ANOVA and Minitab Assistant ANOVA for these 3-lot PQ studies:
– For TubeTensile1, TubeTensile2, TubeTensile3
– For Diameter1, Diameter2, Diameter3
• What are your conclusions?
ANOVA – Alternate Exercise
Analyze this data two ways: 1) Assistant and 2) Stat>ANOVA
Note: Stat>ANOVA assumes equal variances (and so may need transformations), but Minitab Assistant ANOVA does not assume equal variances.
An article in the IEEE Transactions on Components, Hybrids, and Manufacturing Technology (Vol. 15, No. 2, 1992, pp. 146-153) described an experiment in which the contact resistance of a brake-only relay was studied for three different materials (all were silver-based alloys).
Data file: Alloy-Contact Resistance.MPJ
Test at the alpha = 0.01 level: does the type of alloy affect mean contact resistance?
Source: Applied Statistics and Probability for Engineers, 4th Edition, Douglas C. Montgomery and George C. Runger
General Regression can be used for ANOVA
Use for multiple regression – more than one X
General regression can handle: 1) all continuous input(s), 2) all categorical input(s), 3) a mixture of continuous and categorical inputs, and 4) a non-normal response (it allows for the Box-Cox transformation of the response). The response must be continuous or considered as continuous.
General Regression: Example of ANOVA
Note: A blocked One-way ANOVA is a two-way ANOVA where one factor’s effect is to be “blocked out”. The randomization is done within each block.
Background: The forces exerted by three different stylets in a lead are compared at 4 different position/advancement conditions (blocks). The data are given below.
Perform an ANOVA analysis using Stat>Regression>General Regression and determine if: (1) there are significant differences between different stylets, and (2) the blocking factor employed was effective.
Force in Grams (Condition is the Block)
Condition | Stylet 1 | Stylet 2 | Stylet 3
1 | 18.1 | 14.5 | 14.0
2 | 20.0 | 16.1 | 16.3
3 | 30.2 | 27.5 | 26.8
4 | 42.5 | 39.4 | 38.7
Mean (x̄) | 27.70 | 24.38 | 23.95
Data file: Stylet.MTW
Blocked One-way ANOVA
(1) Are there significant differences between different stylets?
(2) Is the blocking factor employed effective?
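For readers who want to see the arithmetic behind the blocked analysis, the sketch below splits the total sum of squares for the stylet data into stylet, block and error parts and forms the two F ratios. It assumes the additive model (no stylet-by-condition interaction), which is what a blocked one-way ANOVA with one observation per cell implies; the variable names are hypothetical and this is not Minitab's General Regression routine.

```python
from statistics import mean

# Force in grams: rows are conditions (blocks) 1-4, columns are stylets 1-3
force = [
    [18.1, 14.5, 14.0],
    [20.0, 16.1, 16.3],
    [30.2, 27.5, 26.8],
    [42.5, 39.4, 38.7],
]
b, t = len(force), len(force[0])             # number of blocks, number of stylets
grand = mean([x for row in force for x in row])
row_means = [mean(row) for row in force]     # block (condition) means
col_means = [mean(col) for col in zip(*force)]  # stylet means

ss_stylet = b * sum((m - grand) ** 2 for m in col_means)
ss_block = t * sum((m - grand) ** 2 for m in row_means)
ss_total = sum((x - grand) ** 2 for row in force for x in row)
ss_error = ss_total - ss_stylet - ss_block   # what the additive model leaves over

df_error = (b - 1) * (t - 1)
ms_error = ss_error / df_error
f_stylet = (ss_stylet / (t - 1)) / ms_error  # compare with F(0.05; 2, 6) = 5.14
f_block = (ss_block / (b - 1)) / ms_error    # compare with F(0.05; 3, 6) = 4.76
print(col_means, f_stylet, f_block)
```

The block sum of squares dwarfs the error sum of squares in this dataset, which is the numerical signature of an effective blocking factor: without blocking, that variation would sit in the error term and mask the stylet differences.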
SAMPLE SIZE FOR ANOVA
Planning Sample Size in ANOVA
Sample Size for One-Way ANOVA Example
• Fill in the number of levels for the factor
• Always fill in Standard Deviation (use conservative estimate)
• Then fill in two of the three long boxes
• Can specify several values, separated by spaces
Sample Size for One-Way ANOVA
RESPONDING TO ANOVA SIGNALS
Statistical vs. Practical Significance
• Key idea in any hypothesis testing effort
– If the test detects a difference (a “signal”), then what?
– Don’t assume the signal is automatically bad news (if you’re hoping for consistency) or good news (if you’re hoping for a change)
• For example, “ANOVA Failure” in PQ
– Examine the size of the signal in the appropriate context . . . determine the “practical” significance of the difference
– The appropriate response depends on an assessment of both statistical and practical significance
ANOVA Signal in PQ
• There was a realization that a significant p-value in the comparison of lot means should not necessarily mean the PQ fails
• Analysis was sometimes included to assess the “power” of the ANOVA and the practical significance of the difference in the means.
• Eventually, Corporate Policy on Manufacturing Process Validation added the “ANOVA Failure Flow Chart”
2008 Version of Corporate Guideline for Manufacturing Process Validation
2012 Version of CRDM ANOVA Signal Flow Chart
Pros and Cons
• Pro
– Provides a consistent way to address the question of practical significance
– Relatively Simple
– Effective: expect the approach to stand up to regulatory scrutiny
• Con
– Can be very prescriptive
– Standards for Ppk are quite high: 95% confidence bound on Ppk > 1.33
– Disincentive for larger sample size
Current approaches
• Corporate Guideline phased out
• CV procedure still has essentially the same ANOVA Signal Flowchart
• CRDM originally had a more prescriptive version
• CRDM currently has a simplified version
• Would also work to include a discussion of the sample size of the ANOVA and the practical significance of the difference
• Discussion – other businesses?
Example of ANOVA Signal Flow Chart
• Recall the ANOVA exercise on Ch1DataFile.mtw for TubeTensile1, TubeTensile2, TubeTensile3
ANOVA Signal Flow Chart Ppk Analysis
First stack the 3 lots using Data -> Stack -> Columns
Then run Stat -> Quality Tools -> Capability Analysis -> Normal
Add confidence interval for Ppk using the Options button
Next steps
• Total sample size is 90, so use confidence bound
• Lower 95% confidence bound on Ppk is 0.92
• Must make 3 more runs
– TubeTensile4, TubeTensile5, TubeTensile6
– These must pass tolerance interval analysis (like the first three runs did)
– All six runs pass tolerance interval analysis
Conclusion
Note: Ppk analysis of all six lots is not required. Included here FYI.
Exercise: ANOVA Signal
• Run ANOVA and assess practical significance
– In Ch1DataFile.mtw, analyze WireTensile1, WireTensile2, WireTensile3
– Specification is 3 lb minimum
– Use one of the ANOVA Signal Flowcharts
– Then use another approach to determine the practical significance of the difference between the means
– Conclusion?
ANOVA: Summary and Recap
• Review Quality Trainer
• Calculations Deep Dive into ANOVA
• Analytically, ANOVA is a special case of Regression
• Sample Size
• ANOVA Signal Flow Chart: some Medtronic divisions use one to standardize the response to an ANOVA Signal in PQ
EQUIVALENCE TESTING
Statistical Logic for Equivalence
• The basic statistical logic is designed to disprove equality.
– Null hypothesis: Two population parameters are equal, e.g. μ1 = μ2.
– Alternative hypothesis: Two population parameters are not equal, e.g. μ1 ≠ μ2.
• We need a different form of logic to affirmatively prove equivalence.
– Null hypothesis: Two population parameters differ by Δ or more, e.g. |μ1 - μ2| ≥ Δ.
– Alternative hypothesis: Two population parameters differ by less than Δ, e.g. |μ1 - μ2| < Δ.
Equality vs. Equivalence
Part of the confusion around the issue of equivalence is that the concepts of equality and equivalence may not be distinguished.
– Equality: Two values/processes are mathematically identical.
– Equivalence: The difference between two values/processes is sufficiently small that it can be deemed practically insignificant.
Approach 1: Confidence Intervals
• The idea is to demonstrate that the confidence interval for the difference of interest is fully contained within the range of practical significance [-Δ,Δ].
Jones, BMJ 1996
Approach 1: Confidence Intervals
• Step 1: Define Practical Significance
– Before collecting data, use scientific/engineering principles to decide what difference, Δ, is practically negligible.
• Step 2: Estimate Sample Size for Experiment
– Based on characterization data or other assumptions, estimate the sample size needed to produce a confidence interval fully contained within [-Δ,Δ]. (Stat > Power and Sample Size > Sample Size for Estimation)
• Step 3: Collect Data and compute confidence interval.
– If the confidence interval is a strict mathematical subset of [-Δ,Δ], equivalence may be declared. If not, equivalence is either uncertain or untrue.
Example of Approach 1
• Two processes will be declared equivalent if the difference in their mean outputs is less than 3 micrometers. So Δ=3.
• Based on characterization data,
– The old process can be modeled as Normal with a mean of 30 and a standard deviation of 2.
– The new process can be modeled as Normal with a mean of 31 and a standard deviation of 1.
– Based on mathematical theory (and assuming the two processes are independent), the distribution of (new – old) must also be Normal with a mean of 1 and a standard deviation of sqrt(2² + 1²) = sqrt(5) = 2.24.
– To be conservative in sample size estimation, the standard deviation is rounded up to 3.
– With an expected mean difference of 1, we need the confidence interval to have a half-width (margin of error) of 2 or less.
Example of Approach 1
Method
Parameter: Mean
Distribution: Normal
Standard deviation: 3 (estimate)
Confidence level: 95%
Confidence interval: Two-sided
Results
Margin of Error: 2
Sample Size: 12
We need n=12 from BOTH processes.
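One plausible way to arrive at the sample size in the output above is to search for the smallest n whose t-based margin of error, t(0.975, n-1) · s / sqrt(n), is at most the target. The sketch below does that with a hard-coded table of standard two-sided 95% t critical values; the function name is hypothetical, and the assumption that Minitab uses exactly this iteration is not confirmed by the slides.

```python
from math import sqrt

# Two-sided 95% t critical values t(0.975, df) for df = 1..15 (standard table values)
T_975 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306,
         2.262, 2.228, 2.201, 2.179, 2.160, 2.145, 2.131]

def sample_size_for_margin(sd, margin):
    """Smallest n (2..16) whose t-based 95% margin of error is at most `margin`."""
    for n in range(2, len(T_975) + 2):
        t_crit = T_975[n - 2]                 # degrees of freedom = n - 1
        if t_crit * sd / sqrt(n) <= margin:
            return n
    raise ValueError("n is beyond the range of the hard-coded table")

n = sample_size_for_margin(sd=3, margin=2)
print(n)
```

With sd = 3 and a margin of 2 the search stops at n = 12, in line with the Minitab result. A z-based shortcut, (1.96 · 3 / 2)², would suggest only about 9, which is why the t-based search is the more conservative choice at small n.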
Example Output
Two-sample T for New vs Old
      N   Mean    StDev  SE Mean
New   12  30.927  0.858  0.25
Old   12  29.19   1.52   0.44
Difference = mu (New) - mu (Old)
Estimate for difference: 1.735
95% CI for difference: (0.671, 2.798)
T-Test of difference = 0 (vs not =): T-Value = 3.44  P-Value = 0.003  DF = 17
• Conclusions:
• The processes are statistically different (p=0.003), which is a statement about non-equality.
• Despite being unequal, the processes are still equivalent. The 95% confidence interval for the difference in means is (0.671, 2.798), which is a strict subset of [-3, 3]
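The equivalence decision in this example can be retraced from the summary statistics alone. The sketch below rebuilds an unequal-variance (Welch) interval and checks containment in [-Δ, Δ]; the critical value 2.110 is the standard table value of t(0.975, 17), and the function names are hypothetical. Because the inputs are the rounded means and standard deviations from the output, the interval matches the reported one only to about two decimals.

```python
from math import sqrt

def welch_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """CI for mean1 - mean2 without assuming equal variances; also returns Welch df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se, df

def is_equivalent(ci_low, ci_high, delta):
    """Approach 1 rule: the whole interval must lie strictly inside [-delta, delta]."""
    return -delta < ci_low and ci_high < delta

low, high, df = welch_ci(30.927, 0.858, 12, 29.19, 1.52, 12, t_crit=2.110)
print(low, high, int(df), is_equivalent(low, high, delta=3))
```

Note that the same interval excludes zero, so the data show a statistically significant difference and equivalence at Δ = 3 at the same time.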
Approach 1: Summary
• The confidence interval approach is the gold standard for clinical trials and other high scrutiny experiments requiring FDA approval.
• It is mathematically equivalent to a p-value-driven approach called TOST (Two One-Sided Tests, here t-tests).
• The confidence interval approach is easier to understand than the original form of TOST.
Post-hoc Problems
• Rigorous application of approach 1 requires that the Δ value be established before collecting data.
• What should we do when data have already been collected without defining the difference of interest or planning sample size?
Approach 2: Retrospective Power Analysis
• When data have already been collected without planning for rigorous “equivalence testing”, equivalence may be assessed by displaying an entire power curve.
• Even if this approach does not set a priori standards for equivalence,
– it provides additional context for an insignificant p-value
– it can help engineering experts to make decisions
• Subjective judgment will be required to determine if the experiment was suitably powered to demonstrate equivalence.
• A power curve is a useful supplement to a traditional analysis, but it does not match the rigor in approach 1.
Approach 2 Method
• After collecting the means and standard deviation of the observed data, create a power curve through the Power and Sample Size platform in Minitab.
• Display and interpret the Power Curve in your data analysis report.
• You may honestly believe that your experiment was sufficiently powered (>80%) to detect meaningful differences, but the post-hoc nature of the analysis makes your argument weaker.
Example
• Consider again our old and new processes which have distributions of N(30, 2²) and N(31, 1²), respectively.
• Suppose we forgot to take approach 1 and instead just collected 5 data points from each process.
• We found a statistical difference when we collected 12 data points, but the p-value goes above 0.05 when collecting only 5:
Two-sample T for New_5 vs Old_5
        N  Mean    StDev  SE Mean
New_5   5  30.744  0.933  0.42
Old_5   5  29.42   3.02   1.4
Difference = mu (New_5) - mu (Old_5)
Estimate for difference: 1.32
95% CI for difference: (-2.61, 5.25)
T-Test of difference = 0 (vs not =): T-Value = 0.93  P-Value = 0.403  DF = 4
Power Curve Inputs
• The observed sample size is n=5
• Desired power levels are in the range of 0.8-0.95
• The pooled standard deviation is 2.24.
Power Curve Output
• With 80% power, this experiment could have detected a difference of about 4.5.
• With 95% power, this experiment could have detected a difference of about 6.
• It is a subjective engineering judgment as to whether such values provide sufficient reassurance about the experimental results.
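The two points read off the power curve can be sanity-checked by simulation. The sketch below repeatedly draws two samples of n = 5 from normal populations with a common standard deviation of 2.24, runs a pooled two-sample t-test at alpha = 0.05, and counts how often the difference is detected. The pooled-variance test with 8 degrees of freedom, the critical value 2.306 (the standard table value of t(0.975, 8)), the seed and the function name are all assumptions for illustration; Minitab computes power analytically rather than by simulation.

```python
import random
from math import sqrt

N = 5                 # observations per group, as in the example
T_CRIT_DF8 = 2.306    # t(0.975, df = 2*N - 2 = 8)

def _mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulated_power(diff, sd, reps=20000, seed=1):
    """Fraction of simulated experiments in which a pooled t-test detects `diff`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        m_a, v_a = _mean_var([rng.gauss(0.0, sd) for _ in range(N)])
        m_b, v_b = _mean_var([rng.gauss(diff, sd) for _ in range(N)])
        pooled_var = (v_a + v_b) / 2          # equal group sizes
        t_stat = (m_b - m_a) / sqrt(pooled_var * 2 / N)
        if abs(t_stat) > T_CRIT_DF8:
            hits += 1
    return hits / reps

power_at_4_5 = simulated_power(4.5, 2.24)
power_at_6 = simulated_power(6.0, 2.24)
print(power_at_4_5, power_at_6)
```

Under these assumptions the simulated powers should land near the 80% and 95% levels quoted on the slide, up to Monte Carlo noise.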
Extensions and Challenges
• Confidence intervals and power curves can be calculated for almost any type of statistical scenario:
– Comparing 2 means
– Comparing >2 means
– Comparing standard deviations
– Comparing reliability curves
• However, the required sample size for proving equivalence of standard deviations is often much larger than the sample size for means.
• Equivalence for means can reasonably be quantified in terms of arithmetic differences (e.g. |μ1 – μ2| < 5), but equivalence for standard deviations will be quantified in terms of multiplicative differences (e.g. ½ < σ1/σ2 < 2).
Exercise – Lesion Depth
• Consider the key requirement for a new ablation catheter: equivalent (or greater) maximum lesion depth, compared to the current design, where the difference of interest is 0.5 mm.
• Previous data shows
– Normal distribution model is adequate for Max Lesion Depth
– Current Design has average max lesion depth of 2.3 mm
– New Design has average max lesion depth of 2.2 mm
– Largest pooled standard deviation of max lesion depth is 0.356.
• Follow Approach 1 to plan sample size for the equivalence test
• Assume test data as follows to complete the equivalence analysis
– New: n=15, mean = 2.733, stdev = 0.342
– Current: n=15, mean = 2.723, stdev = 0.386
• State your conclusion
Alternate Exercise: Equivalence Testing
• Within your team, identify an example of equivalence testing in your own work.
• Apply Approach 1, using actual or made-up characterization data for the planning step.
• Use Minitab to simulate data collection.
– Hint: Use Calc -> Random Data -> Normal . . .
• Use Minitab to complete the Approach 1 data analysis.
• State your conclusion from the data.
EQUIVALENCE Take Away Messages
• An insignificant p-value is not a rigorous method of proving equivalence.
• Ideally, practical significance and sample size should be considered before the experiment begins.
• Rigorously proving equivalence first demands carefully defining the threshold (Δ) of practical significance.
• The most rigorous way to prove equivalence is to demonstrate that a confidence interval is fully contained within [-Δ, Δ].
• An alternative, but less formal, approach is to retrospectively perform a power analysis.
• Don’t feel like you need to remember all the Minitab steps; we hope you remember the concepts and call your neighborhood statistician for further support.
Summary and Review
• Quality Trainer Review
• ANOVA
– Assumptions
– Using Minitab Assistant vs Stat Menu
– Calculation Deep Dive
– Sample Size
– ANOVA Signals
• Equivalence Testing
Chapter 2: Measurement Systems Analysis
Topics
• Quality Trainer Review
• Topics with Variables Data
– Gage R&R Sample Size
– Probability of Misclassification (Variables Data)
– Helpful Hints
• MSA for Destructive Tests
• MSA for Attribute Tests
Quality Trainer Review
Value of Measurement Systems Analysis
If your goal is . . . | then MSA helps by . . .
Process Improvement | Reducing variability in Xs and Ys so that the “key” Xs may be discovered.
Capability Demonstration or Estimation | More accurate measurements of process performance
Sorting Out Bad Product | Reducing the Probability of Misclassification
Innovation | Reduced noise allows discovery of more subtle signals
Recall . . . MSA Concepts
• Bias – Mean (Delta, i.e. difference from reference)
• Linearity – Mean (Bias vs Part or Operating Value)
• Stability – Mean (Bias vs Time)
• Repeatability – Standard Deviation
• Reproducibility – Standard Deviation
• Gage R&R – Standard Deviation
…so linearity and stability should be plotted
…while bias, repeatability and reproducibility are just single numbers
Gage Bias and Linearity
• Bias is the difference between the average of repeated measurements and the “true value”
• MSA tends to focus on Gage R&R (variability), but accuracy (= lack of bias) is equally important
– Assumption that procedures for Calibration are in place: need to confirm
– Assumption that procedures for Calibration are adequate: need to confirm
• “Linearity” is a study of bias across the range of measured values
• In Minitab, use Stat -> Quality Tools -> Gage Study -> Gage Linearity and Bias Study
Gage Stability
Data file: Snap Gauge.mtw
MINITAB® > Stat > Control Charts > Variables Charts for Subgroups > Xbar-R
Measurement system is stable over time as evidenced by:
• Xbar Chart - in control
• R Chart - in control
[Figure: Xbar-R Chart of Rep1, ..., Rep3, plotted by Day. Sample Mean panel: UCL=0.253458, center line (X double bar)=0.2497, LCL=0.245942. Sample Range panel: UCL=0.00946, center line (R bar)=0.00367, LCL=0.]
GAGE R&R SAMPLE SIZE
Gage R&R Sample Size
• General recommendation:
– 5 to 10 Parts (P)
– 2 to 3 Operators (O)
– 2 to 3 Repeats (R)
• More rigorous methods
– Specify minimum Degrees of Freedom for estimating Repeatability and Reproducibility standard deviations
– Use confidence intervals for standard deviation estimates (option provided in Minitab 16)
Degrees of Freedom Approach
• Estimating Reproducibility Std Dev: df = O-1
– Include as many operators as feasible
• Estimating Repeatability Std Dev: df = P*O*(R-1)
– With 30 df, 90% confidence bound on ratio of estimate to true value is (0.79, 1.21). Ref: on www.minitab.com search for “ID 2613” to access “Minitab Assistant White Papers.”
CVG Test Method Validation
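The degrees-of-freedom rules and the (0.79, 1.21) bound quoted above can be checked with a short sketch. Since the ratio (estimated SD / true SD)² follows a chi-square distribution divided by its degrees of freedom, the bound comes from chi-square quantiles; the code below swaps in the Wilson-Hilferty approximation to those quantiles so that only the Python standard library is needed. The function names and the 10-part, 3-operator, 2-repeat study are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def gage_rr_df(parts, operators, repeats):
    """Degrees of freedom behind each standard deviation estimate."""
    return {"reproducibility": operators - 1,
            "repeatability": parts * operators * (repeats - 1)}

def sd_ratio_bounds(df, confidence=0.90):
    """Approximate two-sided bounds on (estimated SD / true SD).

    Uses the Wilson-Hilferty approximation to chi-square quantiles,
    not an exact chi-square inverse.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    c = 2 / (9 * df)
    lower = (1 - c - z * sqrt(c)) ** 3        # approx. chi2 lower quantile / df
    upper = (1 - c + z * sqrt(c)) ** 3        # approx. chi2 upper quantile / df
    return sqrt(lower), sqrt(upper)

df = gage_rr_df(parts=10, operators=3, repeats=2)
lo, hi = sd_ratio_bounds(df["repeatability"])
print(df, round(lo, 2), round(hi, 2))
```

The same study leaves only 2 degrees of freedom for reproducibility, which illustrates why the slide recommends including as many operators as feasible.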
PROBABILITY OF MISCLASSIFICATION
Misclassification
Two Misclassification Probabilities
• Probability of Misclassifying Bad Unit as Good
• Probability of Misclassifying Good Unit as Bad
[Figure: LSL and USL marked, with regions labeled “Probability of Misclassifying Good Unit as Bad Unit” and “Probability of Misclassifying Bad Unit as Good Unit”.]
MINITAB Simulated Estimation of Misclassification: Following Gage R&R study
Part mean = 30, Part Std Dev = 10, Part Upper Spec = 40
No measurement system bias
Gage R&R Std Dev = 2.6
1) Calc/Random Data/Normal (simulate true part measurements)
2) Calc/Random Data/Normal (simulate gage variability)
MINITAB Simulated Estimation of Misclassification (cont)
3) Calc/Calculator: use “+” to add 1) + 2) to simulate observed measurements
4) Calc/Calculator: assign a 1 when the true measurement from 1) is in spec. Ex: (‘TrueMeasure’ ≤ 40)
MINITAB Simulated Estimation of Misclassification (cont)
5) Calc/Calculator: assign a 1 when the observed measurement from 3) is in spec. Ex: (‘ObsMeasure’ ≤ 40)
6) Stat/Table/Crosstabs to crosstabulate 4) and 5).
MINITAB Simulated Estimation of Misclassification (cont)
Estimated % of Truly Out of Spec called In Spec is 2.1%.
The simulation sample size was 10000. A larger sample size would be better.
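The six Minitab steps translate almost line for line into a small Monte Carlo sketch. It uses the slide's inputs (part mean 30, part standard deviation 10, upper spec 40, gage standard deviation 2.6, no bias); the function name, the seed and the larger simulation size of 200,000 are assumptions for illustration. Both results are expressed as fractions of all simulated parts.

```python
import random

def simulate_misclassification(part_mean=30.0, part_sd=10.0, gage_sd=2.6,
                               usl=40.0, n=200_000, seed=1):
    """Return (P[bad part called good], P[good part called bad])."""
    rng = random.Random(seed)
    bad_called_good = good_called_bad = 0
    for _ in range(n):
        true_value = rng.gauss(part_mean, part_sd)        # step 1: true part value
        observed = true_value + rng.gauss(0.0, gage_sd)   # steps 2-3: add gage error
        truly_in_spec = true_value <= usl                 # step 4
        observed_in_spec = observed <= usl                # step 5
        # step 6: tally the two off-diagonal cells of the crosstab
        if observed_in_spec and not truly_in_spec:
            bad_called_good += 1
        elif truly_in_spec and not observed_in_spec:
            good_called_bad += 1
    return bad_called_good / n, good_called_bad / n

p_bad_good, p_good_bad = simulate_misclassification()
print(p_bad_good, p_good_bad)
```

The first value should come out close to the 2.1% reported above. The second is somewhat larger because, with the mean below the spec limit, more truly good parts than truly bad parts sit just next to it.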
MINITAB Misclassification
MINITAB Misclassification
Two problems:
1) Only three decimals for probabilities (i.e. 0.000)
2) Can’t enter historical: 1) process mean 2) part std. dev 3) gage std. dev
(Note: (2) can now be done with CSR work aid 13)
Misclassification Using Minitab and Work Aid 13
MINITAB® file: CSRworkaid13 POM.mtw
Load into the worksheet: the Part mean (30), the Part Sigma (10), and the Gage Sigma (2.6)
MINITAB Misclassification Enlarging the label on the sample mean chart, we see the mean is 30.
MINITAB Misclassification
Examining the output, we see the USL (40), the Part Sigma (10), and the Gage Sigma (2.6). The probability of a truly bad part being called good is 0.021.
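The 0.021 figure can also be obtained without simulation, by integrating the part density above the USL against the chance that gage error pulls the observed value back under the USL. The following is a sketch under the same assumptions (normal parts, normal unbiased gage error); it is not the algorithm used by Minitab, Work Aid 13, or the POM Tool, and the function name is hypothetical.

```python
from statistics import NormalDist


def prob_bad_called_good(part_mean, part_sd, gage_sd, usl, steps=20_000):
    """P(true value > USL and observed value <= USL), by trapezoid integration."""
    part = NormalDist(part_mean, part_sd)
    gage = NormalDist(0.0, gage_sd)
    upper = part_mean + 8 * part_sd   # part density is negligible beyond this point
    width = (upper - usl) / steps
    total = 0.0
    for i in range(steps + 1):
        t = usl + i * width
        weight = 0.5 if i in (0, steps) else 1.0
        # density of a true value t, times the chance the gage reads it as in spec
        total += weight * part.pdf(t) * gage.cdf(usl - t)
    return total * width


pom = prob_bad_called_good(30.0, 10.0, 2.6, 40.0)
print(f"P(bad called good) = {pom:.4f}")
```

Because the result is a floating-point number, it is not limited to the three decimals noted as a problem with the Minitab output.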
Probability of Misclassification (POM) Tool
• Originally written in R by Tarek Haddad to recreate functionality lost when Medstat was retired.
• Jim Dawson collaborated with Tarek to continue development and turn it into an Excel tool.
• A substantial Software Validation effort was undertaken by Nick Finstrom and Barry Christy, with the support of Pete Patel and the CVG Test Method Council. Validation work to be completed in early 2014.
POM Tool
• Replicates Medstat functionality
• More resolution in results than Minitab
• Graphics
• Guardbanding
• Normal, Lognormal and Weibull distributions of parts
POM with Guardband
Exercise
• Run POM analysis
– Using Minitab Simulation
– Using Work Aid 13 and Minitab GRR
– Using POM Tool
HELPFUL HINTS
Gage R&R Helpful Hints - Normality
• Normality testing is not needed for Gage R&R analysis
– Distribution of the raw data will depend strongly on the parts used in the study – there is no expectation or assumption that the raw data will follow any specific distribution
– Repeated measurements on the same part by the same operator will likely follow a normal distribution
• Like any ANOVA model, the residuals are assumed to follow a normal distribution – but the analysis is relatively “robust” to non-normality of the residuals
– Probability of Misclassification does depend on the part or process distribution (each part measured once)
Gage R&R Helpful Hints – One-Sided Specification
• In the case of a one-sided specification, the Percent Tolerance metric depends on the part average
• Minitab uses the overall average in the Gage R&R study as the estimate of the part average
• If the parts used in the study are not representative of the expected part distribution . . .
– The overall average will be a poor estimate of the process average
– The percent tolerance result will be misleading
– Best practice would be to calculate Percent Tolerance separately using a better estimate of the process average
– Being “not representative” can be a good practice – for example, including parts that don’t meet the specification
Corrective Actions for Failed Gage R&R
• Repeatability problem
– Could be due to part positional variation
  • Standardize by measuring same position on each part
  • Or make multiple measurements at random or systematic positions and use the average
– If gage itself is too variable, may need to improve or replace
  • In the meantime, Repeatability variability can be filtered out by taking repeated, independent measurements and using the average. Note that this approach does not correct for Reproducibility issues.
Corrective Actions for Failed Gage R&R
• Reproducibility Problem
– Look for assignable causes that explain the operator-to-operator differences
– Understand any Operator*Part interactions – these may provide clues to differences in technique.
– Possibly improve the measurement procedure and/or re-train the operators
– Improve any visual aids or samples used in the measurement procedure
Approaches to Robust Gage R&R
Standard Gage R&R methods assume that other factors that affect measurements have been studied and controlled in the development of the test method.
If these sources of variability still affect the measurements, then . . .
The Expanded Gage R&R allows you to add additional factors. Besides operator & part, you could add fixture number, gage number or other factors. The Expanded GRR can also handle missing data.
Reference: “Make Your Destructive, Dynamic, and Attribute Measurement System work for you” by William Mawby.
This book includes the Analysis Of Covariance method that allows one to load in the varying environmental factors like temperature & humidity (covariates) into a GRR.
The General Linear Model in Minitab (under the ANOVA branch) can be used to model covariates (also handles missing data).
MSA FOR DESTRUCTIVE MEASUREMENTS
Two Types of Destructive Measurements
1. Truly destructive: Measurement destroys unit being measured
– Pull test
– Peel test
– Tensile test
2. Non-replicable: Measurement process can change the unit or you are measuring a transient phenomenon
– Catapult distance
– Motor speed
– Heart rate
– Dimension of silicon part (can compress)
– Dimensions of heart tissue (can compress)
In neither case is it possible to take repeated measures, so gage R&R is not possible.
Ref: Make Your Destructive, Dynamic, and Attribute Measurement System Work for You, by W. D. Mawby
Approaches to Destructive MSA
• Approach: Develop a non-destructive measurement
– Pro: Ideal solution
– Con: Often difficult or impossible
• Approach: Attempt to use identical parts as “repeat” measurements and apply usual requirements for GRR %Tolerance
– Pro: Easy to apply usual Minitab calculations
– Con: Rarely works because parts aren’t actually identical
• Approach: Use a coupon test so that parts are more identical
– Pro: Results better than above
– Con: Coupons may not be representative – easier to measure than real parts
• Approach: Focus on improving the measurement process using DMAIC
– Pro: Proven methodology
– Con: Cannot conclude whether measurement system is adequate
• Approach: Focus on Reproducibility
– Pro: Not affected by part-to-part variability
– Con: Might miss a Repeatability issue
What about using “Nested” Gage R&R?
• The “nested” Gage R&R analysis applies when one operator measures different parts than another operator.
– For example, John measures parts 1, 2, 3, 4, 5 repeatedly and Jane measures parts 6, 7, 8, 9, 10 repeatedly.
– Common application would be “Inter-laboratory Testing,” where operators at each location measure different parts repeatedly.
– Can work for Destructive MSA if each homogeneous sample may be sub-sampled. Then operators can measure different samples repeatedly.
• Analysis
– The nested analysis does not include a term for Part * Operator interaction.
– Note that Minitab Assistant doesn’t offer the Nested analysis
• Unless sub-sampling of homogeneous material is possible, Nested does not solve the key problem of Destructive MSA
– It’s impossible to repeat the measurement
Destructive Gage R&R Example
MINITAB® worksheet: TestingSupplierCoils.mtw
• Tensile testing of tubing
• 8 pieces of tubing
• Each tubing cut into 2 sub samples
• Assume variation between sub samples due to measurement error
• Assume an upper specification of 850 g
Destructive Gage R&R using sub-samples
Destructive Gage R&R using sub-samples
Nearly all measurement system variation is due to repeatability rather than operator (reproducibility) . . . or maybe sub-sample differences?
Large result for % Tolerance
Measurement system does not distinguish one part from another within the range of parts used in the study
Destructive Gage R&R using sub-samples
• Destructive Gage R&R using subsamples gave poor results
• Since repeatability accounts for most of the apparent measurement variation, it is likely that parts were not very similar
• In this project they used the DMAIC Process Knowledge method to improve the system without obtaining a formal measurement
Focus on Reproducibility
• With destructive measurements, the Repeatability Standard Deviation always includes the part-to-part or subsample-to-subsample variation. In general, repeatability standard deviation cannot be accurately estimated.
• If one population of parts is randomly assigned to multiple operators, then the Reproducibility Standard Deviation is not affected by part-to-part variation.
• Reproducibility standard deviation can be estimated accurately even for destructive tests.
Reproducibility
• Stop
– Trying to force (Repeatability + Part) Standard Deviation to be small enough to meet a requirement.
– Trying to obtain or create “identical” parts.
• Start
– Estimate Reproducibility standard deviation and ensure that it is small enough. This standard deviation depends only on the differences between operator means.
– Compare operator standard deviations. Identify cases where operators show substantially different variation across equivalent sets of parts.
Example: CVG Test Method Validation for Destructive Tests
• Obtain a population of 40 parts
– Do not need to get identical or nearly identical parts
• Randomly assign 10 parts to each of 4 operators
• Calculate %Tolerance for Reproducibility
– Compare to requirement of 25%
• Calculate Std Dev Ratio
– Compare to simulation-based critical values (for typical study, critical value is 3.10)
Example Calculations • Data based on actual TMV studies – But altered to disguise – Detection Time A, Detection Time P
Detection Time A
Run One-Way ANOVA
• Reproducibility = sqrt((0.778-0.627)/10) = 0.123
Calculate Results
• % Tolerance (Reproducibility) = 100 * ((6*0.123) / (2*(30-11.740))) = 100 * (0.738 / 36.52) = 2.02%
• Std Dev Ratio = 0.986 / 0.546 = 1.81
• Result: Pass
Detection Time P
Calculations for Detection Time P
• Reproducibility = sqrt((11.225-0.976)/10) = 1.01
• % Tolerance (Reproducibility) = 100 * ((6*1.01) / (2*(30-14.798))) = 100 * (6.06 / 30.40) = 19.9%
• Std Dev Ratio = 1.113 / 0.846 = 1.32
• Result: Pass
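The two worked examples follow the same arithmetic, which can be wrapped in one function. This is a sketch that reads the slide numbers as follows (an interpretation, since the ANOVA output itself is not reproduced here): 0.778 and 11.225 are the operator mean squares, 0.627 and 0.976 are the error mean squares, 10 is the number of parts per operator, 30 is the one-sided specification limit, 11.740 and 14.798 are the overall means, and the Std Dev Ratio divides the largest operator standard deviation by the smallest. The function name is hypothetical.

```python
from math import sqrt


def destructive_tmv(ms_operator, ms_error, parts_per_operator,
                    spec_limit, overall_mean, largest_sd, smallest_sd):
    """Reproducibility metrics from a one-way ANOVA with operator as the factor."""
    # Operator variance component: (MS between - MS within) / n per operator,
    # floored at zero in case the error mean square is the larger one.
    reproducibility_sd = sqrt(max(ms_operator - ms_error, 0.0) / parts_per_operator)
    # One-sided %Tolerance: 6 * sigma over twice the distance from mean to spec.
    pct_tolerance = 100 * (6 * reproducibility_sd) / (2 * abs(spec_limit - overall_mean))
    sd_ratio = largest_sd / smallest_sd
    return reproducibility_sd, pct_tolerance, sd_ratio


time_a = destructive_tmv(0.778, 0.627, 10, 30, 11.740, 0.986, 0.546)
time_p = destructive_tmv(11.225, 0.976, 10, 30, 14.798, 1.113, 0.846)

for name, (sd, pct, ratio) in (("Detection Time A", time_a), ("Detection Time P", time_p)):
    passed = pct <= 25 and ratio < 3.10   # requirements quoted in the CVG example
    print(f"{name}: SD={sd:.3f}, %Tol={pct:.1f}%, ratio={ratio:.2f}, pass={passed}")
```

The code carries full precision, so Detection Time P comes out slightly above the slide's 19.9%, which was computed from the rounded standard deviation 1.01.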
Exercises
• Open Destructive Exercises.mtw
• For Bond Strength results:
– Assume specification is Minimum 5 lb
– Analysis
  • Individual Value Plot
  • % Tolerance for Reproducibility
  • Std Dev Ratio
  • Is this destructive measurement system adequate?
• Repeat for Buckle Force results
– Assume specification is Maximum 340 grams
MSA FOR ATTRIBUTE MEASUREMENTS
ATTRIBUTE GAGE R&R
• Attribute data are usually the result of human judgment
– Which category does this item belong in?
• When categorizing items, you need a high degree of agreement on which way an item should be categorized
• The best way to assess human judgment is to have all operators repeatedly categorize several known test units (Attribute Gage R&R)
– Look for agreement
  • each person categorizes the same unit consistently
  • there is agreement between the operators on each unit
– Use disagreements as opportunities to determine and eliminate problems
SETTING UP AN ATTRIBUTE GAGE STUDY
• Most important aspect of attribute Gage Study is selecting parts (representative defects)
• Most challenging aspect is choosing parts for the study. Typically use . . .
– 50% acceptable parts
– 50% defective parts
• Have operators repeatedly classify parts in random order without knowledge of which part they are classifying (blind study)
Analysis of Attribute Gage R&R
• Stat > Quality Tools > Attribute Agreement Analysis
– Percent Agreement based on number of Parts
– Kappa Statistics (range -1 to 1)
• Minitab Assistant > Measurement System Analysis
– More graphical output
– Accuracy statistics based on number of Appraisals
– No Kappa statistics
Use Minitab Assistant -> Measurement Systems Analysis (MSA)
Create Attribute Agreement worksheet
Create Result Data
• Choose Number of Appraisers = 3
• Choose Number of Trials = 2
• Choose Number of Test Items = 10
• Items 1-5 are “Good”; Items 6-10 are “Bad”
• Click “OK”
• Copy column “Standards” and paste into “Results”
• Fix column name back to “Results”
• Find first trial of Item 1 and Item 2 – Change result from “Good” to “Bad” to inject two errors into the simulated study
• Save onto Desktop as “Attribute GRR”
Attribute Agreement Analysis
Summary Report: Attribute Agreement Analysis for Results
[Figure: Minitab Assistant Summary Report]
Is the overall % accuracy acceptable? (scale from “< 50%, No” to “100%, Yes”): 96.7%
The appraisals of the test items correctly matched the standard 96.7% of the time.
Misclassification Rates
– Overall error rate: 3.3%
– Good rated Bad: 6.7%
– Bad rated Good: 0.0%
– Mixed ratings (same item rated both ways): 6.7%
% Accuracy by Appraiser (Appraiser 1, Appraiser 2, Appraiser 3): bar values 100.0, 100.0 and 90.0, with the overall 96.7% marked
Comments
Consider the following when assessing how the measurement system can be improved:
-- Low accuracy rates: Low rates for some appraisers may indicate a need for additional training for those appraisers. Low rates for all appraisers may indicate more systematic problems, such as poor operating definitions, poor training, or incorrect standards.
-- High misclassification rates: May indicate that either too many Good items are being rejected, or too many Bad items are being passed on to the consumer (or both).
-- High percentage of mixed ratings: May indicate items in the study were borderline cases between Good and Bad, thus very difficult to assess.
Slide notes:
– Attribute “c=0” result . . . showing that no bad parts were misclassified as good
– Overall, 96.7% of presentations were classified correctly
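The headline numbers in the Summary Report can be reproduced from the simulated study with a few lines of code. This is a sketch, not the Minitab Assistant calculation; it assumes that the two altered rows both belong to Appraiser 1 on trial 1, which the slides do not state.

```python
from itertools import product

appraisers = ["Appraiser 1", "Appraiser 2", "Appraiser 3"]
trials = [1, 2]
standard = {item: ("Good" if item <= 5 else "Bad") for item in range(1, 11)}

# Start from perfect agreement (results copied from the standard), then
# inject the two errors. Assumption: both are Appraiser 1, trial 1.
results = {(a, t, i): standard[i] for a, t, i in product(appraisers, trials, standard)}
results[("Appraiser 1", 1, 1)] = "Bad"
results[("Appraiser 1", 1, 2)] = "Bad"


def accuracy(rows):
    """Percent of appraisals (appraiser, trial, item) that match the standard."""
    rows = list(rows)
    return 100 * sum(results[r] == standard[r[2]] for r in rows) / len(rows)


overall = accuracy(results)
by_appraiser = {a: accuracy(r for r in results if r[0] == a) for a in appraisers}
good_rated_bad = 100 - accuracy(r for r in results if standard[r[2]] == "Good")
bad_rated_good = 100 - accuracy(r for r in results if standard[r[2]] == "Bad")

print(f"overall accuracy: {overall:.1f}%")
print(f"good rated bad: {good_rated_bad:.1f}%, bad rated good: {bad_rated_good:.1f}%")
```

With 60 appraisals and two injected errors, the accuracy is 58/60, and both errors fall among the 30 appraisals of Good items.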
Accuracy Report: Attribute Agreement Analysis for Results
All graphs show 95% confidence intervals for accuracy rates. Intervals that do not overlap are likely to be different.
Illustrates the 95% / 90% result
[Figure: four interval plots on a 40 to 100 percent scale: % by Appraiser (Appraisers 1 to 3), % by Appraiser and Standard (Good, Bad), % by Standard (Good, Bad), and % by Trial (1, 2)]
Kappa
Kappa is a measure of raters’ agreement.
Minitab:
• Reports two Kappa statistics: Fleiss’ & Cohen’s
• Defaults to Fleiss’ Kappa
Minitab will only calculate Cohen’s Kappa if you choose the option for Cohen’s Kappa, and if one of these two conditions is true:
• A) Two appraisers perform a single trial on each sample
• B) One appraiser performs two trials on each sample
Kappa is meant for attribute data.
Kappa ranges from -1 to 1.
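Kappa compares the observed agreement with the agreement expected by chance: kappa = (p_observed - p_expected) / (1 - p_expected). The sketch below computes Cohen's kappa for the simplest case, two sets of ratings on the same samples (condition A or B above). It is a textbook calculation rather than Minitab's implementation, and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of category labels."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the two marginal proportions, summed over categories.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n ** 2
    # Note: undefined (division by zero) when both lists use a single category.
    return (observed - expected) / (1 - expected)


# Hypothetical data: one appraiser, two trials, ten samples, one disagreement.
trial_1 = ["Good"] * 4 + ["Bad"] * 6
trial_2 = ["Good"] * 5 + ["Bad"] * 5
kappa = cohens_kappa(trial_1, trial_2)
print(f"kappa = {kappa:.2f}")
```

Here the raw agreement is 90%, but half of that would be expected by chance, so kappa is noticeably lower than the percent agreement.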
Kappa (Landis and Koch)
According to AIAG (Auto industry), a general rule of thumb is:
A Kappa value greater than 0.75 indicates good to excellent agreement. Kappa values less than 0.40 indicate poor agreement.
This general rule of thumb may not apply for most Medtronic applications. Any disagreement on rejectable units would be of concern.
Kappa calculations
Kappa results
Summary and Recap
• Quality Trainer Review
• Topics with Variables Data
– Gage R&R Sample Size
– Probability of Misclassification (Variables Data)
– Helpful Hints
• MSA for Destructive Tests
• MSA for Attribute Tests
BACKUP SLIDES
Destructive Gage R&R - 2 Nested Designs
• 2 Stage Nested Design Approach
• Samples are parts that can be subdivided into homogenous sub samples.
• Stage 1: 1 operator measures sub-samples (2-5) from parts (5-10).
• Stage 2: 3 operators each measure same location per part (5-10).
[Diagram, Stage 1: 1 Operator, Parts 1, 2, ... 10, Locations 1, 2, ... 5 within each part]
[Diagram, Stage 2 (1 sub-sample per part): Operators 1, 2, 3, each with Parts 1, 2, ... 10]
Destructive Gage R&R - 2 Stage Die Bond Example (cont.)
MINITAB® project: Destructive 2 stage nested.mpj
• Project: Pull testing of die bond. Parts are die. Sub-samples are 5 wire locations on the die. Spec = 7.5 grams minimum.
• Stage 1: 1 operator pull tests all 5 wire locations on each of 10 die.
• Stage 2: Each of 3 operators pull tests 10 die at wire location 1.
Destructive Gage R&R - 2 Stage Die Bond Example (cont.)
Stage 1: Stat > ANOVA > Fully Nested ANOVA
From worksheet: stage1

Nested ANOVA: Pull Strength versus Die
Variance Components
Source   Var Comp.   % of Total   StDev
Die        0.088       15.50      0.296
Error      0.479       84.50      0.692
Total      0.567                  0.753

The Die variance component (0.088) is the estimate $\hat{\sigma}^2_{part}$.
Destructive Gage R&R - 2 Stage Die Bond Example (cont.)
Stage 2: Stat > ANOVA > Fully Nested ANOVA
From worksheet: stage2

Nested ANOVA: Pull Strength (Wire 1) versus Operator
Variance Components
Source     Var Comp.   % of Total   StDev
Operator     0.053       11.08      0.231
Error        0.428       88.92      0.654
Total        0.481                  0.694

The Operator variance component (0.053) is the estimate $\hat{\sigma}^2_{operator}$; the Error variance component (0.428) is the estimate $\hat{\sigma}^2_{part/repeat}$.
Destructive Gage R&R - 2 Stage Die Bond Example (cont.)
Manual calculation of Gage Repeatability and Reproducibility
$\hat{\sigma}^2_{repeat} = \hat{\sigma}^2_{part/repeat} - \hat{\sigma}^2_{part} = 0.428 - 0.088 = 0.340$
$\hat{\sigma}^2_{R\&R} = \hat{\sigma}^2_{repeat} + \hat{\sigma}^2_{operator} = 0.340 + 0.053 = 0.393$
Compare Gage R&R variance to part variance if parts are chosen to be representative of production process. Since this is a one-sided spec (7.5 grams) use Misclassification to determine gage acceptance.
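The manual calculation is short enough to script, which also gives the Gage R&R standard deviation needed as an input to a misclassification analysis. This is a sketch using the variance components from the two nested ANOVA tables; the dictionary layout and variable names are illustrative.

```python
from math import sqrt

# Variance components taken from the two nested ANOVA tables.
stage1 = {"Die": 0.088, "Error": 0.479}        # one operator, 5 wire locations per die
stage2 = {"Operator": 0.053, "Error": 0.428}   # three operators, wire location 1 only

# Stage 2 error mixes die-to-die and repeatability variation, so the
# stage 1 die component is subtracted to isolate repeatability.
repeat_var = stage2["Error"] - stage1["Die"]
grr_var = repeat_var + stage2["Operator"]
grr_sd = sqrt(grr_var)

print(f"repeatability variance = {repeat_var:.3f}")
print(f"gage R&R variance      = {grr_var:.3f}")
print(f"gage R&R std dev       = {grr_sd:.3f}")
```

If the subtraction ever produced a negative value, the usual convention would be to treat the repeatability variance as zero; that case does not arise with these numbers.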
Kappa – Call Center Example
Call Center workers were asked to categorize types of calls they received.
MINITAB® worksheet: Callcat.mtw
Kappa Attribute Analysis: Option Setting
Kappa : Within Appraiser Agreement
Kappa: Each Appraiser vs Standard
Kappa for Appraisers
What do we conclude from this analysis for the raters performance?
What would you do next?
Can this method be applied to the banana data?
Distribution Analysis
The Art of Finding Useful Models
Jeremy Strief, Ph.D.
MECC Principal Statistician

Objectives
• Explain why distributional analysis is statistically complicated (and sometimes emotionally frustrating!)
• Emphasize the importance of engineering theory and historical precedent.
• Encourage the use of multiple graphical methods in addition to numerical tests.
• Review common causes of Non-Normality.
• Discuss Transformations and how they compare to fitting non-Normal distributions.
MedtronicConfidential
Recap from Quality Trainer
• Normal Distribution Basics
• Capability Analysis (Normal)
• Capability Analysis (Non-Normal)
• Graphical tools
– Boxplots
– Histograms
– Individual Value Plots
Distribution Analysis Motivation and Philosophy
Why Assess Distribution
• Statistical tools vary in sensitivity to and effect of distributional assumptions
• Some MDT procedures require distributional assessment for those statistical methods which are highly sensitive to distributional assumptions

Statistical Tool                      Distributional Sensitivity   Effect of Poor Distributional Fit
Capability Analysis                   High                         Incorrect PPM/Ppk
Tolerance Intervals                   High                         Incorrect Bounds
Variables Lot Acceptance Sampling     High                         Altered rejection and acceptance rates
Individuals Chart for SPC             High                         Incorrect control limits
GLM / Regression / ANOVA              Med                          approximate p-value
Xbar chart for SPC                    Med/Low                      approximate p-value
Two-sample t-test                     Low                          approximate p-value
Non-parametric methods                Low                          approximate p-value
Not All Data Are Normal: Example
Lead Time Data usually have a long tail – skewed distribution
[Figure: Histogram of Time (Frequency vs. Time, 0 to 50)]
[Figure: Probability Plot of Time, Normal (Percent vs. Time). Mean 12.31, StDev 9.656, N 100, AD 5.738, P-Value < 0.005]
Not All Data are Normal: Considerations • Observed data need not follow any tractable mathematical model. • Some mathematical models may be useful, if imperfect, representations of the data.
Frustrations with Distributional Analysis
• Larger sample sizes (n>100) cause the statistical tests to detect small departures from a theoretical model. Such departures may not be practically significant. • Smaller sample sizes (n<15) often yield multiple distributions with p-values greater than 0.05. Graphs may look sparse and thus may not narrow one’s choice of distribution. • Note: for both cases the data needs to come from a process in control.
The Underlying Statistical Hypotheses
• The statistical hypothesis testing is ‘backward,’ in that the null hypothesis assumes that the particular distribution is a good fit.
– H0: Distribution specified has a good fit
– H1: Distribution specified has lack-of-fit
• Low p-values will disprove the fit of a distribution, so certain distributions can be ruled out as reasonable models.
• Using the standard goodness-of-fit metrics, it is technically not possible to prove that a particular distribution is the “true model” for the data.
• Instead of providing statistical “proof”, distribution analysis is geared toward assessing which statistical distributions are plausible models for the data at hand.
Philosophy of Distribution Analysis
“All models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind.” --G.E.P. Box
N=15 Probability Plots
N=500 Examples
Only 12 out of 500 values were affected by the truncation or censoring.
How to Determine Distribution
Priority order
1. Scientific/Engineering Knowledge
2. Historical distribution analysis
3. Distribution analysis
Why is distribution analysis last?
• Sample size (50 to 100)
• Regardless of n, key Xs and shift and drift can mask true distribution
Distribution applies to short term data only
Importance of Engineering Theory
• The choice of distribution should be both statistically plausible and scientifically justified.
• Engineering theory and historical precedents often suggest whether a distribution should be Normal, Lognormal, or Weibull.
• If scientific theory does not lead to one single statistical model, at least consider
– Whether the distribution should be skewed or symmetric
– Which distributions can be ruled out
Data Analysis Philosophy
• Information shouldn’t be destroyed. Examples of information destruction are
– Converting variables data to attribute data.
– Heavy rounding with a bad measurement system.
– Drifting measurement system.
• Check the quality and structure of the raw data.
– Are there physically impossible values, wild outliers, missing values, too many ties?
– Are the data paired or unpaired?
– Was randomization employed?
– How were the data generated?
Data Analysis Philosophy
• Plot the data AND do analytics.
– PLOT histograms, run charts, scatter plots, and so on. See what is going on. Do a probability plot for process data.
– Use ANALYTICS to get quantitative about what you have seen. Examine the residual plots from analytical model fits.
• Analyses are performed on yesterday’s data today to predict tomorrow’s performance.
– Data from an unstable process that is analyzed (ignoring the instability) may result in a conclusion that will not hold up tomorrow.
Distribution Analysis Review of Engineering Distributions
Most Common Statistical Models for Engineering Applications
• Weibull • Exponential (special case of Weibull) • Lognormal • Normal
Weibull
• A flexible model which can assume many different shapes, depending on the choice of parameters
• Scale parameter α or η
• Shape parameter β
• Arises from “weakest link” failures, or situations when the underlying process focuses on the minimum or maximum value of independent, positive random variables.
• Models stress-strength failures
Exponential
• Special case of Weibull when β=1
• Constant hazard rate, meaning that the probability of failure is not a function of the age of the device/material.
• May occur when multiple failure modes are operating simultaneously
• May be useful in modeling software failures resulting from external sources (e.g. cosmic radiation causes bit-flips at an extremely low, constant rate)
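The constant hazard rate can be seen directly from the Weibull hazard function, h(t) = (β/η)(t/η)^(β-1), which loses its dependence on t when β = 1. The sketch below evaluates it for three shape values; the scale of 50 and the time points are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


times = (1.0, 10.0, 100.0)   # hypothetical ages
scale = 50.0                 # hypothetical scale parameter

exponential = [weibull_hazard(t, 1.0, scale) for t in times]   # shape = 1
decreasing = [weibull_hazard(t, 0.5, scale) for t in times]    # shape < 1
increasing = [weibull_hazard(t, 3.0, scale) for t in times]    # shape > 1

print("shape 1.0:", exponential)
print("shape 0.5:", decreasing)
print("shape 3.0:", increasing)
```

A shape below 1 gives a falling hazard (infant mortality) and a shape above 1 gives a rising hazard (wearout), which is why the Weibull covers both situations.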
Lognormal
• Models time-to-failure caused by several forces which combine multiplicatively.
• Describes time to fracture from fatigue crack growth in metals.
• Right skewed distribution, useful when data values take multiple orders of magnitude (e.g. 1.4, 14, 140).
• Two parameters (μ,σ), each of which is traditionally expressed on the log scale.
• So if X~Lognormal(μ,σ), then ln(X)~Normal(μ,σ)
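Because μ and σ live on the log scale, they are the mean and standard deviation of ln(X), not of X itself. The simulation below illustrates the difference; the parameter values (μ = 2.0, σ = 0.5), the seed, and the sample size are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(7)
mu, sigma = 2.0, 0.5   # hypothetical log-scale parameters

x = [rng.lognormvariate(mu, sigma) for _ in range(50_000)]
logs = [math.log(value) for value in x]

log_mean = statistics.fmean(logs)    # estimates mu
log_sd = statistics.stdev(logs)      # estimates sigma
raw_mean = statistics.fmean(x)       # estimates exp(mu + sigma^2 / 2), not exp(mu)
raw_median = statistics.median(x)    # estimates exp(mu)

print(f"mean of ln(X) = {log_mean:.3f}, sd of ln(X) = {log_sd:.3f}")
print(f"mean of X = {raw_mean:.2f}, median of X = {raw_median:.2f}")
```

The mean of X sits above its median, which is the right skew described in the bullets above.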
Normal
• Models time-to-failure caused by additive, independent forces
• Commonly describes gage error, dimensional measurements from a supplier, and other symmetric, bell-shaped phenomena
Additional Models to Consider • Logistic • Smallest Extreme Value (SEV) • Largest Extreme Value (LEV)
Some Relationships
• SEV distribution = ln(Weibull distribution).
• LEV distribution = ln(1/Weibull distribution).
• Normal distribution = ln(Log-normal distribution).
• All Weibull distributions can be rescaled and repowered to get another Weibull.
• The Weibull(100,4) is very close to a Normal (mean=90.64, s.d.=25.43). This normal is thicker in the tails than the Weibull(100,4).
Ref: 02SR013 “Algorithm for Computing Weibull Sample Size for Complete Data”
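The mean and standard deviation quoted for the matching Normal follow from the Weibull moment formulas: mean = η·Γ(1 + 1/β) and variance = η²·[Γ(1 + 2/β) − Γ(1 + 1/β)²]. A short check, reading Weibull(100,4) as scale 100 and shape 4 (an interpretation of the slide's notation):

```python
from math import gamma, sqrt


def weibull_mean_sd(scale, shape):
    """Mean and standard deviation of a two-parameter Weibull distribution."""
    m1 = gamma(1 + 1 / shape)   # E[(X / scale)]
    m2 = gamma(1 + 2 / shape)   # E[(X / scale)^2]
    return scale * m1, scale * sqrt(m2 - m1 ** 2)


mean, sd = weibull_mean_sd(100, 4)
print(f"mean = {mean:.2f}, s.d. = {sd:.2f}")
```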
Review: Common Engineering Distributions
• Weibull: Wearout; Infant mortality; Time to stress/strength related failure
• Normal: Default; Measurement error; Dimensions
• Lognormal: Lead Time; Time to fatigue related failure
Distribution Analysis Statistical Overview
Statistical Approach to Distribution Analysis
• Both graphical and numerical approaches are needed
• P-value is not definitive, given the “backward” nature of hypothesis testing
• Visual assessment of the probability plot is crucial
• Reasonably large sample sizes (~50) are needed. Consult your local procedures (e.g. DOC000550 within CRDM) for specific rules.
Distribution Analysis Graphical Methods
Good Distribution Analysis Should Always Begin With Plots!
• Probability plots • Histograms • Time plots
Probability Plot
• A probability plot is a 2-dimensional plot with specialized (often logarithmic) axes, to facilitate comparison between observed data and a hypothesized distribution.
• More specifically, a probability plot is a comparison between the observed and theoretical quantiles (i.e. percentiles) for a hypothesized distribution.
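The quantile comparison can be made concrete with a few lines of code. The sketch below pairs each sorted observation with the Normal quantile at its plotting position and then measures how straight the resulting line is with a correlation coefficient, which is the idea behind correlation-based normality tests such as Ryan-Joiner. The plotting position (i − 0.375)/(n + 0.25) is one common choice and is an assumption here, as are both data sets.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def normal_probability_points(data):
    """Return (theoretical quantile, observed value) pairs for a Normal probability plot."""
    ordered = sorted(data)
    n = len(ordered)
    fitted = NormalDist(fmean(ordered), stdev(ordered))
    return [(fitted.inv_cdf((i - 0.375) / (n + 0.25)), x)
            for i, x in enumerate(ordered, start=1)]


def plot_correlation(points):
    """Pearson correlation of the plotted points (1.0 means a perfectly straight line)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


symmetric = [7.1, 8.4, 9.0, 9.6, 10.0, 10.3, 10.9, 11.5, 12.2, 13.0]   # hypothetical
skewed = [1.0, 1.2, 1.5, 2.0, 2.6, 3.5, 5.0, 8.0, 14.0, 30.0]          # hypothetical

r_symmetric = plot_correlation(normal_probability_points(symmetric))
r_skewed = plot_correlation(normal_probability_points(skewed))
print(f"symmetric data: r = {r_symmetric:.3f}")
print(f"skewed data:    r = {r_skewed:.3f}")
```

The skewed data bend away from the line and give a lower correlation. The code does not compute a p-value; that would need critical values for the chosen test.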
Probability Plot Interpretation
• If the distribution is a good fit to the data, the plotted points should fall approximately in a straight line.
• When interpreting the probability plot, examine both the p-value and the visual fit.
– At the tails of the distribution, look at whether the points are falling on the conservative side of the fitted line.
– Look for major deviations in the pattern of points from a straight line: kinks, ties, curves, jumps, etc. Do not worry if a few points fall outside the confidence bounds.
– Fat Pencil Test: Can the observed data values be covered up by a “fat pencil”?
Probability Plot in Minitab
Probability Plot Examples Right skew and curvature:
Large N makes for obvious curvature:
Probability Plot Examples “Subtle Patterns” can be caused by randomness
Both datasets were sampled directly from a Normal distribution.
Probability Plot Examples
• Distribution does not pass the Anderson-Darling test, but the lower tail of the distribution falls on the conservative side of the fitted line.
• Distribution appears to have a lower limit of zero
• It would be conservative to use the Normal model to estimate the lower tail behavior.
Histograms in Minitab The graph menu offers a histogram platform, but the graphical summary platform offers more information with fewer clicks.
Histograms
• More intuitive than probability plots, since the x-y axes are not transformed.
• Not informative with small sample sizes (<30)
• Can theoretically be misleading if the bin width is calculated inappropriately, but in practice the histogram is a useful tool for moderate-to-large sample sizes
[Figure: two example histograms, labeled “Apparent right skew” and “Approximately Bell-Shaped”]
Time Plots
• Fitting a single distribution to your data implies that the underlying process is stable.
• Without a stable process, distributional fit is irrelevant.
• Time plots and control charts help evaluate the stability of your process.
Why is Stability needed to Assess Distribution?
MINITAB® worksheet: Distribution Analysis Shift and Drift.mtw
Distribution Assessment Risks
• Shift and Drift, and Variation in Key Xs masks distribution
• Initial capability data always contains Shift and Drift
• At Final Capability, process is stable and variation in Key Xs is removed
Example data: 100 samples from Week 1, 25 samples from Week 2, 100 samples from Week 3
Distribution applies to short term data only
Initial Process Data often have Shift and Drift
[Figure: I Chart of Initial Capability Data (Individual Value vs. Observation, 1 to 221). X-bar = 19.93, UCL = 26.30, LCL = 13.55. A large share of the points are flagged “1”, above the UCL or below the LCL.]
Long Term Data May not be Normal
Combined Data is not normal
[Figure: Probability Plot of Initial Capability Data, Normal - 95% CI (Percent vs. Initial Capability Data). Mean 19.93, StDev 9.679, N 225, AD 13.617, P-Value <0.005]
But Short Term Data Could be Normal
Each week is normal
[Figure: Probability Plot of Initial Capability Data, Normal - 95% CI (Percent vs. Initial Capability Data), plotted separately by week]
Week   Mean    StDev   N     AD      P
1      9.871   2.155   100   0.476   0.233
2      20.39   2.203   25    0.280   0.616
3      29.87   2.011   100   0.236   0.785
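The effect of shift and drift on a normality test can be reproduced with simulated data. The sketch below generates three normal “weeks” with roughly the means, standard deviation, and sample sizes shown in the table (rounded to 10, 20, 30 and 2, which is an assumption) and computes the Anderson-Darling statistic for each week and for the pooled data. It computes the A-squared statistic only; converting it to a p-value, as Minitab does, is not attempted here.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def anderson_darling_normal(data):
    """Anderson-Darling A-squared against a Normal fitted by sample mean and SD."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(fmean(x), stdev(x))
    cdf = [fitted.cdf(value) for value in x]
    total = sum((2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
                for i in range(1, n + 1))
    return -n - total / n


rng = random.Random(3)
weeks = {
    1: [rng.gauss(10, 2) for _ in range(100)],
    2: [rng.gauss(20, 2) for _ in range(25)],
    3: [rng.gauss(30, 2) for _ in range(100)],
}

ad_by_week = {week: anderson_darling_normal(values) for week, values in weeks.items()}
pooled = [value for values in weeks.values() for value in values]
ad_pooled = anderson_darling_normal(pooled)

for week, ad in ad_by_week.items():
    print(f"week {week}: AD = {ad:.3f}")
print(f"pooled: AD = {ad_pooled:.3f}")
```

Each week on its own gives a small statistic, while the pooled data give a much larger one even though every individual value came from a normal distribution.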
Distribution Analysis Numerical Methods
Numerical Methods
• For all numerical methods:
– A large (≥0.05) p-value implies there is no evidence against the hypothesized distribution.
– A small (<0.05) p-value implies there is statistically significant lack-of-fit.
• It is commonly stated that a distributional test “passes” when p ≥0.05.
• A passing test does NOT mean that the hypothesized distribution is “correct” or “best.” There may be multiple models which fit the data, and you should choose whichever model best matches science and historical precedent.
Most Common Normality Tests
• Anderson-Darling (AD) test
• Ryan-Joiner test
Note: The Ryan-Joiner test is essentially equivalent to the Shapiro-Wilk test.
Anderson-Darling
• Default approach in Minitab.
• May be used to assess fit of Normal and non-Normal distributions.
• Gives unreliable results when data are discretized/grouped, which is fairly common when measurement system resolution is poor.
Anderson-Darling in Minitab For assessing Normality:
Anderson-Darling in Minitab For any/all distributions:
Anderson-Darling Results Normal(10,1.5)
Normal(10,1.5)--Rounded
Ryan-Joiner
• Useful for discretized, rounded, or clumpy data
• Will not declare significant lack-of-fit simply due to poor measurement resolution
• Recommended minimum of 5 “groups” to have a meaningful p-value. Fewer groups may yield an overly optimistic (high) p-value.
[Figure panels: Anderson-Darling, Ryan-Joiner]
Ryan-Joiner in Minitab
Truncation
• The Normal distribution may be used to model tail behavior if it provides a conservative estimate of those tails.
• This situation arises when data are truncated, which is quantitatively captured as negative kurtosis.
• In principle, truncated data may be evaluated graphically or through a Skewness-Kurtosis (SK) test.
• The SK test checks whether the tails of the Normal distribution are longer or shorter than the tails of your data.
• MECC has created and validated an Excel spreadsheet (R134997) which executes the SK test. [Embedded Microsoft Excel worksheet not reproduced.]
• In practice, consult your local procedures to ensure your analysis of truncated data is compliant.
Avoiding Parametric Distributions Altogether
• Chebyshev’s inequality captures the tail behavior of any statistical distribution with a finite variance.
  – For any random variable X and constant k > 1, P(|X − μ| ≥ kσ) ≤ 1/k²
• This inequality may be useful for skipping the issue of distributional fit altogether, especially if distributional fit is being assessed in order to compute a tolerance interval.
• Chebyshev’s will only be helpful if the process capability is extremely high.
• Consult your own procedures for details, but CRDM procedures invoke the following version of Chebyshev:
  – If the nearest specification is at least 10 standard deviations away from the mean, it may be inferred by Chebyshev that at least 99% of the distribution will fall within specification.
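The 99% figure follows directly from the inequality with k = 10. A minimal Python sketch of that arithmetic is shown here; the function name is illustrative and not part of any procedure.

```python
def chebyshev_min_fraction_within(k: float) -> float:
    """Lower bound on P(|X - mu| < k*sigma) for any finite-variance distribution."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1.0 - 1.0 / (k * k)

# How quickly the guaranteed coverage grows with the distance to the spec limit.
for k in (2, 3, 6, 10):
    print(f"k = {k:>2}: at least {chebyshev_min_fraction_within(k):.4%} within k sigma")
```

Because the bound makes no distributional assumption, it is very conservative: a Normal process with limits at 3 standard deviations has about 99.73% coverage, while Chebyshev guarantees only about 88.9%.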
Why Normality Tests Fail
1. A shift occurred in the middle of the data
2. Multiple sources or multiple failure modes with different distributions
3. Outliers
4. Piled up data
5. Truncated data (sorted before you get it)
6. The underlying distribution is not normal (skewed)
7. Poor measurement resolution
8. Too much data (overpowered to detect non-normality)
9. Due to random chance – you expect the test to fail 5% of the time (i.e. 95% confidence) if the data were truly from a normal distribution.
Resolving Non-Normality
1. Data shift: Sublot; Skewness/kurtosis test; Attribute sampling
2. Multiple data sources: Sublot; Skewness/kurtosis test; Attribute sampling
3. Outliers: Attribute sampling; Outlier removal (May remove outliers only if they constitute typos or data collection errors.)
4/5. Censored/Truncated data (tails lost): Skewness/kurtosis test; Conservative fitting; Attribute sampling
6. Distribution not normal: Non-normal analysis; Transformation; Attribute sampling
7. Poor measurement resolution: Ryan-Joiner
8. Too much data: Graphical evidence
9. Random chance: Skewness/kurtosis test; Random subsampling; Historical assessment
When Multiple Distributions Fit
Prior engineering knowledge is particularly useful when multiple distributions yield p-values above 0.05:
– Picking the distribution solely based on best p-value or best R² is rational when there is absolutely no history or scientific theory.
– A better approach is to assemble a list of plausible (p>0.05) distributions and then make a final choice based upon history and science.
– P-values will sometimes be below 0.05 simply as a result of chance (Type I error). It is not recommended to immediately change years of analysis based on one significant p-value. Investigate and monitor before changing distributions.
Avoid the daily special
– Do NOT take the “distribution du jour” approach, in which multiple distributions are chosen for a single process. This reflects either:
  • An out-of-control process, which can’t be captured by a single distribution anyway.
  • The bad statistical practice of just defaulting to the distribution with the highest p-value.
Example: Capability for Non-Normal Data using Tribal Knowledge for Distribution
MINITAB® worksheet: LoanApplicationTime.MTW
Problem Statement: Time (in days) to process (reject/accept) loan applications is too long, causing loss in customer applications.
Project Goal: Decrease potential customer loss from 15% to 5%. Customer expectation is 20 days.
Project Strategy: Path Y = Time
Task: Determine capability for Y = Time
Assume lead time has a LogNormal Distribution
Verify Lognormal Distribution
[Figure: Probability Plot of Time (Lognormal, 95% CI). X-axis: Time (log scale); Y-axis: Percent.]
Loc 2.269, Scale 0.6845, N 100, AD 0.432, P-Value 0.299
Check if LogNormal provides a good fit
Capability for Non-Normal Data using LogNormal
[Figure: Process Capability of Time, calculations based on Lognormal distribution model, with USL marked.]
Process Data: LSL *, Target *, USL 20, Sample Mean 12.31, Sample N 100, Location 2.26918, Scale 0.684493
Overall Capability: Z.Bench 1.06, Z.LSL *, Z.USL 0.47, Ppk 0.16
Exp. Overall Performance: PPM < LSL *, PPM > USL 144242, PPM Total 144242
Observed Performance: PPM < LSL *, PPM > USL 160000, PPM Total 160000
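The expected PPM above the USL can be checked by hand from the fitted Location and Scale, since ln(Time) is Normal under a LogNormal model. The sketch below is an illustration using only the Python standard library; the function name is hypothetical, and it is not a description of how Minitab computes its output.

```python
from math import log
from statistics import NormalDist

def lognormal_ppm_above(usl, location, scale):
    """Return (z, ppm) for the upper tail of a LogNormal(location, scale) model."""
    z_usl = (log(usl) - location) / scale        # USL expressed on the log scale
    tail = 1.0 - NormalDist().cdf(z_usl)         # P(Time > USL)
    return z_usl, tail * 1e6

z, ppm = lognormal_ppm_above(usl=20, location=2.26918, scale=0.684493)
print(f"z on log scale = {z:.2f}, expected PPM > USL = {ppm:.0f}")
```

With the slide's inputs this gives a z close to the reported Z.Bench of 1.06 and a tail close to the reported 144242 PPM. The Z.USL and Ppk values on the slide come from a different calculation that is not reproduced here.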
Distribution Analysis Transformations
Two Options
• When a dataset is non-Normal, it is acceptable either to
  – Mathematically transform the data to achieve Normality
  – Fit a non-Normal distribution
• Transformation carries the practical advantage that many statistical methods are based upon Normality, so there will be more analytical tools available for the transformed dataset.
• Transformation carries the disadvantages of creating unnatural units (e.g. log-meters instead of meters) and altering potentially relevant structures of the data.
• Note: Please do NOT try transformations of data from an unstable process, or bimodal data (two bumps).
Transformation Advice
• If a transformation is chosen, it should be as simple as possible, and it should ideally have a physical interpretation.
• A log transformation is particularly desirable, since it
  – Is monotonic
  – Is straightforward to interpret (it turns multiplicative effects into additive effects)
  – Is equivalent to the LogNormal distribution
  – Is common in the literature
• The Johnson transformation is a last resort, as it
  – Rarely has any scientific/engineering meaning
  – Involves a complicated mathematical structure
  – Is not universally considered an “acceptable” transformation
• Any Box-Cox transformation with a lambda value within [-2, 2] is typically acceptable, although the chosen lambda should ideally have a physical meaning.
Transformation Advice
• There is no transformation which will eliminate outliers!
  – By definition, an outlier is so far away from the rest of the data values that it is unlikely to belong to the same distribution.
  – An attribute approach is typically needed when outliers are present.
  – Investigate the outlier and determine if there were any typos or other unusual circumstances which would warrant deletion.
  – Outliers should NOT be deleted unless there is a strong argument as to why the outlier is not representative of the process.
  – An apparent outlier could possibly be a “typical” datapoint from a highly skewed distribution, like LogNormal or LEV.
  – Use engineering thinking as well as statistical thinking to decide the best course of action for outlier mitigation.
• Stay consistent in your choice of transformation. Inconsistency implies an unstable process/distribution.
Box-Cox Transformations
(when there is no theoretical distribution)
Assumptions for Y
• Y > 0; Y is skewed (right or left)
• Y is unimodal (single peak)
Box-Cox determines transform to make Y normal
• Y(λ) = (Y^λ − 1) / λ   for λ ≠ 0
•      = loge(Y)         for λ = 0
Use Box-Cox when there is no theoretical distribution
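The two-branch definition can be written as a small function. This is a sketch of the textbook formula only, with an illustrative function name; it does not estimate the optimal lambda.

```python
from math import log

def box_cox(y: float, lam: float) -> float:
    """Box-Cox transform of a single positive value y for a given lambda."""
    if y <= 0:
        raise ValueError("Box-Cox requires Y > 0")
    if lam == 0:
        return log(y)                    # natural log branch
    return (y ** lam - 1.0) / lam        # power branch

# The power branch approaches the log branch as lambda goes to 0.
for lam in (2, 0.5, 0.001, 0, -0.5, -1):
    print(f"lambda = {lam:>6}: Y = 7 maps to {box_cox(7, lam):.4f}")
```

Some software reports the simpler power form Y^λ instead. For λ ≠ 0 that is the same transformation up to a linear rescaling, so the shape of the transformed data is unchanged.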
Box-Cox Transformations
(when there is no theoretical distribution)
Typical Box-Cox transformations
 λ      Transformation
 2      Y²
 0.5    sqrt(Y)
 0      loge(Y)
−0.5    1 / sqrt(Y)
−1      1 / Y
Use Box-Cox when there is no theoretical distribution
Example: Capability for Non-Normal Data using Box-Cox
MINITAB® worksheet: Error Resolution Time.MTW
Problem Statement: Time (in days) to resolve errors in case report forms for a pre-market clinical evaluation is too long, causing delay in the product release.
Project Goal: Decrease error resolution time. Expectation is 7 days.
Project Strategy: Path Y = Resolution Time
Task: Determine capability for Y = Resolution Time
Example: Verify LogNormal
[Figure: Probability Plot of Resolution Time (Lognormal, 95% CI). X-axis: Resolution Time (log scale); Y-axis: Percent.]
Loc 1.760, Scale 1.303, N 200, AD 3.623, P-Value <0.005
Fails
• 3 second rule
• Fat pencil test
• p-value
Not LogNormal!
Example: Apply Box-Cox Transformation
Example: Determine Optimal Lambda
[Figure: Box-Cox Plot of Resolution Time. X-axis: Lambda; Y-axis: StDev; Lower CL, Upper CL, and Limit lines marked.]
Lambda (using 95.0% confidence): Estimate 0.26, Lower CL 0.15, Upper CL 0.38, Rounded Value 0.26
Box-Cox transformation of Y. Default = rounded value.
Example: Calculate Capability for Non-Normal Data Using Box-Cox
Example: Capability for Box-Cox Transformed Y
[Figure: Process Capability of Resolution Time Using Box-Cox Transformation With Lambda = 0.26, histogram of transformed data with USL* marked.]
Process Data: LSL *, Target *, USL 7, Sample Mean 10.2928, Sample N 200, StDev(Within) 9.25009, StDev(Overall) 9.5492
After Transformation: LSL* *, Target* *, USL* 1.65972, Sample Mean* 1.66391, StDev(Within)* 0.503383, StDev(Overall)* 0.485671
Potential (Within) Capability: Z.Bench -0.01, Z.LSL *, Z.USL -0.01, Cpk -0.00, CCpk -0.00
Overall Capability: Z.Bench -0.01, Z.LSL *, Z.USL -0.01, Ppk -0.00, Cpm *
Observed Performance: PPM < LSL *, PPM > USL 520000.00, PPM Total 520000.00
Exp. Within Performance: PPM < LSL* *, PPM > USL* 503324.68, PPM Total 503324.68
Exp. Overall Performance: PPM < LSL* *, PPM > USL* 503445.93, PPM Total 503445.93
Capability = Z.Bench (Potential)
A Desirable Problem
• If your data could be handled either through a transformation or a non-Normal distribution, either path is acceptable.
• All else being equal, a recommended prioritization is as follows:
  1. Log Transformation (= LogNormal model)
  2. Weibull/Exponential model
  3. Box-Cox with lambda≠0 but lambda within [-2, 2]
  4. Other engineering distribution (SEV/LEV, logistic, etc.)
• Any prioritization scheme should be interpreted as a heuristic, not as the one true path.
• The most important thing is to plot your data and arrive at a mathematical solution which makes sense within the engineering/scientific context at hand.
• As much as possible, remain consistent in your choice of statistical method. Avoid the “distribution du jour” or “transformation du jour.”
Distribution Analysis Flowchart
Normality Testing Flowchart: CRDM
• CRDM: Meant as a teaching aid, not an official quality doc.
[Flowchart figures (four slides) not reproduced.]
Distribution Analysis Summary and Challenge Problem
Objectives Recap
• Explain why distributional analysis is statistically complicated (and sometimes emotionally frustrating!)
• Emphasize the importance of engineering theory and historical precedent.
• Encourage the use of multiple graphical methods in addition to numerical tests.
• Review common causes of Non-Normality
• Discuss Transformations and how they compare to fitting non-Normal distributions
Distribution Analysis Commentary
• Distribution fitting is NOT about finding the true distribution for your data; statistical theory CANNOT prove that a particular distribution is the true model for the data.
  – A model is “true” if it still fits when the sample size approaches infinity.
  – With engineering data, it is often the case that distributions are approximately Normal when N=50, but taking N=200 or N=500 will show small (but statistically significant) departures from Normality.
  – In such a situation, the Normal distribution is often still a useful model even if it is not a “true model.”
• Instead of providing scientific truth, distribution analysis is geared toward assessing which statistical distributions are plausible models for the data at hand.
Distribution Analysis Commentary
• Good distribution fitting should combine statistical analysis with engineering/scientific thinking.
• Even before any data are collected, engineering theory and historical precedents often suggest a distributional form:
  – Does the process involve any sort of maximization or minimization of physical forces? If so, then Weibull might be a good model.
  – Does the process involve the averaging of multiple small forces? If so, the Normal might be a good model.
  – Are there historical precedents which suggest which model is best?
• Ideally, the chosen distribution should have an insignificant p-value AND it should intuitively match with engineering principles.
Don’t Forget Business Context
• Usually distribution analysis is just one step in a larger analytical problem.
• Keep the larger business/engineering problem in mind, as it may suggest
  – Whether only one tail of the dataset needs to be modeled.
  – Whether a single low p-value might be a statistical false alarm.
  – Whether the model needs to produce highly precise numbers or just be “in the ballpark.”
Challenge Problem
• MECC Supplier Dataset: mecc_supplier.mtw
• Business goal is to qualify the supplier as having high capability, and possibly to create a variables or attribute acceptance sampling plan.
• LSL: 0.058
• USL: 0.064
• Analyze the data and offer your opinion of what distribution is best for the situation at hand.
• What questions would you ask the Supplier Quality Engineer to help refine your decision?
Process Capability Analysis
Objectives
• QT Review
• Process Capability
Recap from Quality Trainer
• Introduction
• Process Capability for Normal Data
• Capability Indices
• Process Capability for Non-Normal Data
• Summary
A5 Process Capability
• Measuring Process Capability
• Sigma Scale, Z scores, DPM=PPM
• Process Capability Indices (Cp, Cpk, Pp, Ppk)
• Impact of Normality & Process Stability
• Attribute Data
• Non-normal Data
• Minitab Assistant
• Impact of Sample Size (Confidence Limits)
• Comparison to Tolerance Intervals
• Impact of Measurement Error
SIX SIGMA QUALITY LEVEL
Process Capability: Comparison between what the process produces vs. what is required
[Figure: Histogram of Process Output (Density vs. X) with Customer Requirement marked at 130.]
Mean = 100, Std Dev = 5, Z = 6.0 (6σ)
Defect Rate: 1 part per billion
NOTE: 2 parts per billion for two-sided specs
SIX SIGMA QUALITY LEVEL
To Estimate Long-Term Performance, Apply a 1.5σ SHIFT IN MEAN
[Figure: Histogram of Process Output (Density vs. X) with Customer Requirement marked at 130.]
Mean = 107.5, Std Dev = 5, Z = 4.5 (4.5σ)
Defect Rate: 3.4 parts per million
SIGMA SCALE
Standard Normal Tail Area Probability

Process Sigma   z             P(Z>z)      DPMO      % Conforming
(Short-Term)    (Long-Term)
6.0              4.5          0.0000034   3.4       99.99966
5.5              4.0          0.0000317   32        99.99683
5.0              3.5          0.0002327   233       99.9767
4.5              3.0          0.0013500   1,350     99.865
4.0              2.5          0.0062097   6,210     99.379
3.5              2.0          0.0227501   22,750    97.72
3.0              1.5          0.0668072   66,807    93.32
2.5              1.0          0.1586553   158,655   84.13
2.0              0.5          0.3085375   308,538   69.15
1.5              0.0          0.5000000   500,000   50.00
1.0             -0.5          0.6914625   691,462   30.85
0.5             -1.0          0.8413447   841,345   15.87
0.0             -1.5          0.9331928   933,193   6.68
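Each row of the sigma scale can be regenerated from the standard Normal tail area after subtracting the 1.5 shift. The following sketch uses Python's standard library; the function name and the default shift argument are illustrative choices.

```python
from statistics import NormalDist

def sigma_scale_row(process_sigma: float, shift: float = 1.5):
    """Return (long-term z, tail probability, DPMO, % conforming) for one row."""
    z_long_term = process_sigma - shift
    tail = 1.0 - NormalDist().cdf(z_long_term)   # P(Z > z)
    return z_long_term, tail, tail * 1e6, 100.0 * (1.0 - tail)

for sigma in (6.0, 4.5, 3.0, 1.5, 0.0):
    z, tail, dpmo, conforming = sigma_scale_row(sigma)
    print(f"{sigma:4.1f}  {z:5.1f}  {tail:.7f}  {dpmo:10.1f}  {conforming:.5f}")
```

Setting shift to 0 gives the short-term (unshifted) tail area instead, which is how the 1 part per billion figure for Z = 6.0 arises.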
Process Capability Indices
• Cp, Pp: Process Capability Ratio. Variation only, ignores mean.
• Cpk, Ppk: Includes mean to account for centering.
• Cp, Cpk use within-subgroup variation estimate of σ (Short-Term, Potential).
• Pp, Ppk use overall sample standard deviation estimate (Long-Term, Actual).
Process Capability Ratio
• Spec Width: USL – LSL defines allowable variation
• 6σ defines actual process variation
[Table: Process fallout and the process capability ratio (PCR); not reproduced.]
Within Subgroup Unbiasing Constants
[Table of unbiasing constants not reproduced.]
Simulating Within Subgroup Variation: EXERCISE
1. Randomly sample from a known population: Normal (μ=100, σ=5). Using Calc > Random Data > Normal, simulate 10,000 subgroups of size n=5 by placing data into C1-C5.
2. Compute the Mean (X-bar), Range (R), and StDev (S) for each subgroup. Using Calc > Row Statistics, place into C6, C7, C8.
3. Compute the Variance (S²), R/d2, and S/c4 (for n=5, d2=2.326, c4=0.939986). Using Calc > Calculator, place into C9, C10, C11.
4. Evaluate the performance of the 6 statistics in columns C6-C11 in estimating population parameters. Using Stat > Basic Statistics > Display Descriptive Statistics or Stat > Basic Statistics > Graphical Summary.
5. Which statistics are biased and which ones are unbiased?
NOTE: Average(R/d2) = (R-bar/d2)
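For readers without Minitab at hand, the same exercise can be sketched in Python with the standard library. The seed, function name, and dictionary keys are arbitrary choices for illustration; the constants d2 and c4 are the n=5 values quoted in the exercise.

```python
import random
import statistics

D2_N5 = 2.326       # unbiasing constant for the range, subgroup size 5
C4_N5 = 0.939986    # unbiasing constant for the standard deviation, subgroup size 5

def simulate_subgroups(n_subgroups=10_000, mu=100.0, sigma=5.0, seed=1):
    """Average the six exercise statistics over simulated subgroups of size 5."""
    rng = random.Random(seed)
    columns = {"Xbar": [], "R": [], "S": [], "S2": [], "R/d2": [], "S/c4": []}
    for _ in range(n_subgroups):
        subgroup = [rng.gauss(mu, sigma) for _ in range(5)]
        r = max(subgroup) - min(subgroup)
        s = statistics.stdev(subgroup)
        columns["Xbar"].append(statistics.fmean(subgroup))
        columns["R"].append(r)
        columns["S"].append(s)
        columns["S2"].append(s * s)
        columns["R/d2"].append(r / D2_N5)
        columns["S/c4"].append(s / C4_N5)
    return {name: statistics.fmean(values) for name, values in columns.items()}

for name, average in simulate_subgroups().items():
    print(f"{name:>5}: {average:.3f}")
```

Comparing the averages with μ=100, σ=5, and σ²=25 answers step 5: X-bar, S², R/d2, and S/c4 land near their targets, while the raw R and raw S do not estimate σ without their unbiasing constants.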
Estimating Within Subgroup StDev
Data Source: 20 subgroup samples of 5 parts taken from a component manufacturing process. Data are coded x 0.0001 in. + 0.50 in.
Applied Statistics & Probability for Engineers, 6th Edition (Montgomery & Runger, Wiley 2013)
LSL = 25, USL = 45, Target = 35
[Figure: Xbar-S Chart of Component Data, n=5 per subgroup, samples 1 to 20.]
Xbar chart: UCL=36.82, X-double-bar=33.32, LCL=29.82
S chart: UCL=5.116, S-bar=2.449, LCL=0
NOTE: Center line on S chart is calculated backward from Unbiased Pooled StDev: 2.605*0.939986 = 2.449 for subgroup size n=5
Stat > Quality Tools > Capability Analysis > Normal . . .
[Figure: Process Capability of Component Data, n=5 per subgroup, histogram with LSL, Target, and USL marked.]
Process Data: LSL 25, Target 35, USL 45, Sample Mean 33.32, Sample N 100, StDev(Within) 2.60524, StDev(Overall) 3.29946
Potential (Within) Capability: Cp 1.28, CPL 1.06, CPU 1.49, Cpk 1.06
Overall Capability: Pp 1.01, PPL 0.84, PPU 1.18, Ppk 0.84, Cpm 0.90
Observed Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Exp. Within Performance: PPM < LSL 702.65, PPM > USL 3.68, PPM Total 706.32
Exp. Overall Performance: PPM < LSL 5840.77, PPM > USL 200.09, PPM Total 6040.85
PPM Estimates Assume a Stable Process That Is Normally Distributed
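The indices and expected PPM values in this output follow from the standard capability formulas under a Normal model. The sketch below recomputes them from the reported mean and standard deviations; the function and key names are illustrative, and Cpm is not included.

```python
from statistics import NormalDist

def normal_capability(mean, stdev, lsl, usl):
    """Capability indices and expected PPM for a Normal model.

    Pass the within-subgroup StDev to get Cp/CPL/CPU/Cpk, or the overall
    StDev to get Pp/PPL/PPU/Ppk; the arithmetic is the same.
    """
    lower = (mean - lsl) / (3 * stdev)
    upper = (usl - mean) / (3 * stdev)
    model = NormalDist(mean, stdev)
    return {
        "ratio": (usl - lsl) / (6 * stdev),       # Cp or Pp
        "lower": lower,                           # CPL or PPL
        "upper": upper,                           # CPU or PPU
        "k": min(lower, upper),                   # Cpk or Ppk
        "ppm_below": model.cdf(lsl) * 1e6,
        "ppm_above": (1.0 - model.cdf(usl)) * 1e6,
    }

within = normal_capability(33.32, 2.60524, lsl=25, usl=45)
overall = normal_capability(33.32, 3.29946, lsl=25, usl=45)
for label, result in (("within", within), ("overall", overall)):
    print(label, {name: round(value, 2) for name, value in result.items()})
```

Only the standard deviation changes between the two calls, which is why a gap between Cpk and Ppk points to extra variation between subgroups rather than to a centering problem.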
Stat > Basic Statistics > Normality Test . . .
[Figure: Probability Plot of Component Data (Subgroup Data Stacked), Normal. X-axis: Data; Y-axis: Percent.]
Mean 33.32, StDev 3.299, N 100, AD 0.620, P-Value 0.104
Stat > Quality Tools > Capability Sixpack > Normal . . .
[Figure: Process Capability Sixpack of Component Data. Panels: Xbar Chart, R Chart, Last 20 Subgroups, Capability Histogram, Normal Prob Plot, Capability Plot.]
Xbar Chart: UCL=36.82, X-double-bar=33.32, LCL=29.82
R Chart: UCL=12.81, R-bar=6.06, LCL=0
Specifications: LSL 25, Target 35, USL 45
Normal Prob Plot: AD 0.620, P 0.104
Within: StDev 2.605, Cp 1.28, Cpk 1.06, PPM 706.32
Overall: StDev 3.299, Pp 1.01, Ppk 0.84, Cpm 0.90, PPM 6040.85
NOTE: Center line on R chart is calculated backward from Unbiased Pooled StDev: 2.605*2.326 = 6.06 for subgroup size n=5
Stat > Quality Tools > Capability > Between/Within . . .
[Figure: Between/Within Capability of Component Data, n=5 per subgroup, histogram with LSL, Target, and USL marked.]
Process Data: LSL 25, Target 35, USL 45, Sample Mean 33.32, Sample N 100, StDev(Between) 2.37001, StDev(Within) 2.60524, StDev(B/W) 3.52197, StDev(Overall) 3.29946
B/W Capability: Cp 0.95, CPL 0.79, CPU 1.11, Cpk 0.79
Overall Capability: Pp 1.01, PPL 0.84, PPU 1.18, Ppk 0.84, Cpm 0.90
Observed Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Exp. B/W Performance: PPM < LSL 9080.54, PPM > USL 456.04, PPM Total 9536.58
Exp. Overall Performance: PPM < LSL 5840.77, PPM > USL 200.09, PPM Total 6040.85
NOTE: Use Between/Within analysis when there is significant variation between subgroups
Estimating Within StDev from Individual Measurements
Data Source: 20 samples of individual measurements of concentration taken at one-hour intervals from a chemical process.
Applied Statistics & Probability for Engineers, 6th Edition (Montgomery & Runger, Wiley 2013)
LSL = 95, USL = 105, Target = 100
[Figure: I-MR Chart of Concentration, observations 1 to 20.]
Individuals chart: UCL=105.98, X-bar=99.10, LCL=92.21
Moving Range chart: UCL=8.461, MR-bar=2.589, LCL=0
Stat > Quality Tools > Capability Analysis > Normal . . .
[Figure: Process Capability of Concentration, histogram with LSL, Target, and USL marked.]
Process Data: LSL 95, Target 100, USL 105, Sample Mean 99.095, Sample N 20, StDev(Within) 2.29563, StDev(Overall) 1.97603
Potential (Within) Capability: Cp 0.73, CPL 0.59, CPU 0.86, Cpk 0.59
Overall Capability: Pp 0.84, PPL 0.69, PPU 1.00, Ppk 0.69, Cpm 0.76
Observed Performance: PPM < LSL 50000.00, PPM > USL 0.00, PPM Total 50000.00
Exp. Within Performance: PPM < LSL 37226.30, PPM > USL 5051.62, PPM Total 42277.92
Exp. Overall Performance: PPM < LSL 19117.21, PPM > USL 1402.63, PPM Total 20519.84
PPM Estimates Assume a Stable Process That Is Normally Distributed
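With individual measurements there are no subgroups, so the within StDev is commonly estimated from the average moving range divided by d2 = 1.128 (the constant for ranges of two consecutive points). The sketch below applies that textbook relation to the chart values reported above; it is an assumption that this matches the software's default method, and the names are illustrative.

```python
D2_MOVING_RANGE = 1.128   # d2 for moving ranges of two consecutive observations

def individuals_chart_limits(x_bar: float, mr_bar: float):
    """Return (within StDev, LCL, UCL) for an individuals chart."""
    sigma_within = mr_bar / D2_MOVING_RANGE
    return sigma_within, x_bar - 3 * sigma_within, x_bar + 3 * sigma_within

sigma_within, lcl, ucl = individuals_chart_limits(x_bar=99.095, mr_bar=2.589)
print(f"StDev(Within) = {sigma_within:.3f}, LCL = {lcl:.2f}, UCL = {ucl:.2f}")

# Cp from the same estimate, using the slide's specification limits.
cp = (105 - 95) / (6 * sigma_within)
print(f"Cp = {cp:.2f}")
```

The result agrees with the reported StDev(Within) to about three significant figures; the small difference comes from using the rounded MR-bar printed on the chart.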
Stat > Basic Statistics > Normality Test . . .
[Figure: Probability Plot of Concentration, Normal. X-axis: Concentration; Y-axis: Percent.]
Mean 99.10, StDev 1.976, N 20, AD 0.398, P-Value 0.333
Stat > Quality Tools > Capability Sixpack > Normal . . .
[Figure: Process Capability Sixpack of Concentration. Panels: I Chart, Moving Range Chart, Last 20 Observations, Capability Histogram, Normal Prob Plot, Capability Plot.]
I Chart: UCL=105.98, X-bar=99.10, LCL=92.21
Moving Range Chart: UCL=8.461, MR-bar=2.589, LCL=0
Specifications: LSL 95, Target 100, USL 105
Normal Prob Plot: AD 0.398, P 0.333
Within: StDev 2.296, Cp 0.73, Cpk 0.59, PPM 42277.92
Overall: StDev 1.976, Pp 0.84, Ppk 0.69, Cpm 0.76, PPM 20519.84
Compare Process Capability Indices to Diagnose Improvement Actions
• Cp: Potential Capability (Inherent Variation)
• Ppk: Overall Performance
• Cp vs. Cpk: Disparity Indicates Centering Issue
• Pp vs. Ppk: Disparity Indicates Centering Issue
• Cp vs. Pp: Disparity Indicates Stability Issue
• Cpk vs. Ppk: Disparity Indicates Stability Issue
FOUR POSSIBILITIES (Donald J. Wheeler)
Is Process In Statistical Control? Assessed with Control Charts (LCL, UCL).
Is Process Capable of Meeting Requirements? Assessed with Process Capability Indices (Cp, Cpk, Pp, Ppk); requires LSL, USL.

                 In Control: Yes                  In Control: No
Capable: Yes     Ideal State (Monitor)            Brink of Chaos (Remove Special Causes)
Capable: No      Threshold State (Alter System)   State of Chaos
Centered, Stable, Capable
[Figure: Time Series Plot of A (Index 1 to 100) and Process Capability of A histogram with LSL and USL marked.]
Process Data: LSL 90, Target *, USL 110, Sample Mean 99.8045, Sample N 100, StDev(Within) 1.48539, StDev(Overall) 1.45512
Potential (Within) Capability: Cp 2.24, CPL 2.20, CPU 2.29, Cpk 2.20
Overall Capability: Pp 2.29, PPL 2.25, PPU 2.34, Ppk 2.25, Cpm *
Observed Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Exp. Within Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Exp. Overall Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Not Centered, Stable, Potentially Capable
[Figure: Time Series Plot of B (Index 1 to 100) and Process Capability of B histogram with LSL and USL marked.]
Process Data: LSL 90, Target *, USL 110, Sample Mean 106.814, Sample N 100, StDev(Within) 1.36089, StDev(Overall) 1.4326
Potential (Within) Capability: Cp 2.45, CPL 4.12, CPU 0.78, Cpk 0.78
Overall Capability: Pp 2.33, PPL 3.91, PPU 0.74, Ppk 0.74, Cpm *
Observed Performance: PPM < LSL 0.00, PPM > USL 20000.00, PPM Total 20000.00
Exp. Within Performance: PPM < LSL 0.00, PPM > USL 9616.16, PPM Total 9616.16
Exp. Overall Performance: PPM < LSL 0.00, PPM > USL 13080.51, PPM Total 13080.51
Centered, Stable, Not Capable
[Figure: Time Series Plot of C (Index 1 to 100) and Process Capability of C histogram with LSL and USL marked.]
Process Data: LSL 90, Target *, USL 110, Sample Mean 100.309, Sample N 100, StDev(Within) 4.73733, StDev(Overall) 4.66247
Potential (Within) Capability: Cp 0.70, CPL 0.73, CPU 0.68, Cpk 0.68
Overall Capability: Pp 0.71, PPL 0.74, PPU 0.69, Ppk 0.69, Cpm *
Observed Performance: PPM < LSL 20000.00, PPM > USL 30000.00, PPM Total 50000.00
Exp. Within Performance: PPM < LSL 14772.82, PPM > USL 20394.82, PPM Total 35167.63
Exp. Overall Performance: PPM < LSL 13515.67, PPM > USL 18831.51, PPM Total 32347.18
Centered, Unstable, Potentially Capable
[Figure: Time Series Plot of D (Index 1 to 100) and Process Capability of D histogram with LSL and USL marked.]
Process Data: LSL 90, Target *, USL 110, Sample Mean 100.106, Sample N 100, StDev(Within) 1.23073, StDev(Overall) 3.78355
Potential (Within) Capability: Cp 2.71, CPL 2.74, CPU 2.68, Cpk 2.68
Overall Capability: Pp 0.88, PPL 0.89, PPU 0.87, Ppk 0.87, Cpm *
Observed Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Exp. Within Performance: PPM < LSL 0.00, PPM > USL 0.00, PPM Total 0.00
Exp. Overall Performance: PPM < LSL 3782.21, PPM > USL 4459.79, PPM Total 8242.00
Critical Thinking of Data & Analysis is Required for Valid Inferences
DATA + CONDITIONS
• How was it collected?
• At a single point in time, or over multiple time points?
• Were all sources of variation acting during the data collection timeframe?
ANALYSIS: STATISTICS
• Control Charts: Is variation stable over time?
• Process Capability Indices: Cp, Cpk (Within, Short-Term); Pp, Ppk (Overall, Long-Term)
INFERENCE: PREDICTION
• Future Performance
QUALIFICATION STUDY: Data collected at a single time point over limited conditions. Therefore, control charts and Ppk may not reflect long-term performance since the analysis was computed from a short-term data set. Recommend using study sample size as the subgroup size for capability analysis.
How to Evaluate Process Capability
• Stability: Compare Cpk to Ppk or Cp to Pp
• Centering: Compare Cp to Cpk or Pp to Ppk
• Variation: Compare Cp, Pp to 1.0
• World Class Performance: Cpk > 2, Ppk > 1.5
How to Improve Process Capability
(1) Make Process Stable
(2) Center Process Mean
(3) Reduce Process Variation
(4)* Widen Specification Limits
* What is required for option #4? What quality system requirement exists to assure that option #4 is done with scientifically sound rationale?
Quality Improvement Process
Attribute Data: WHAT IS Z
What is Z? Tells how capable Y is relative to specs.
Z    DPMO
6    3.4
5    233
4    6,210
3    66,807
2    308,537
1    691,462
DPMO = defects per million opportunities
Opportunities = Number of Units * Opportunities per Unit (to have a defect)
Defects = number of observed defects in the Number of Units
Attribute Capability Measures
1) A defect rate or a defective rate (they are the same if there is only one opportunity per unit for a defect; in this case a defective unit has only one defect)
2) DPMO
3) Z
Attribute Data: ROADMAP FOR CAPABILITY
Capability Roadmap: What Type of Data Do You Have?
• Attribute Data
• Variables Data
MINITAB: Stat > Quality Tools > Capability Analysis > Normal
Z.st
Z Bench Potential (Within)
For Attribute Data: Can use Minitab: Stat > Quality Tools > Capability Analysis > Binomial (for defective units or one defect opportunity per unit)
Warning: you have to manually add 1.5 to Z from Minitab to get Z.st from the Six Sigma Project Guide.
Attribute Data: ATTRIBUTE PROCESS CAPABILITY
Opps per unit is the number of opportunities per unit to have this particular defect. A unit may have more than one opportunity to have a specific defect. (It is conservative to assume only 1 opportunity of a defect per unit.)
The Six Sigma Project Guide is used to carry out the capability calculations. The icon for this Guide looks like this: [icon image not reproduced]
Open up this Project Guide and click on the initial capability icon.
Note: Z.ST stands for Z short term, which is a common measure to use in Six Sigma.
Attribute Data Example: Capability for Attribute Data
Project Goal: Improve Freestyle first pass yield from 10% to 20%.
50 Freestyles inspected, 44 defective. What is initial capability?

                     Defects   Opps per Unit   Units   Z.ST   Z.ST 95% Upper   Z.ST 95% Lower
Initial Capability   44        1               50      0.33   0.80             -0.19
Final Capability     NA        NA              NA

Based on n=50, we are 95% confident: Z.ST < 0.80, Z.ST > -0.19
Attribute Data Example: Capability for Attribute Data
Graph of Initial vs Final Capability (good way to present capability!)
Capability for Attribute Y
Project Goal (%Defective): 80.000%

                     Defects   Opps per Unit   Units   Total Opps   DPMO     Z.ST   Z.ST 95% Upper   Z.ST 95% Lower   Project Goal Z.ST
Initial Capability   44        1               50      50           880000   0.33   0.80             -0.19            0.658
Final Capability     80        1               100     100          800000   0.66   0.95             0.36             0.658

Final Capability:
# Units is arbitrary (since we don’t have any final data yet)
# Defects = # Units * Project Goal % = 100 * 80% = 80 (assumes goal is met)
Attribute Data
Attribute capability can be expressed as:
- a proportion defective with a confidence interval
- a Z with a confidence interval
Example: Capability for Attribute Data (cont)
[Figure: Initial vs Final Capability, %Defective with 95% Confidence Bounds, with Project Goal line]
% Defective is a good way to explain capability
[Figure: Initial vs Final Capability, Z.ST with 95% Confidence Bounds, with Project Goal line]
But in Lean Sigma they like Z
Attribute Data
For baseline capability: 44/50 defective units (one opportunity per unit)
Inputs: Minitab: Stat > Quality Tools > Capability Analysis > Binomial
Attribute Data
For baseline capability: 44/50 defective units (one opportunity per unit)
Outputs: Binomial Process Capability Analysis of Defectives
[Figure panels: P Chart (UCL = 1, P-bar = 0.88, LCL = 0.7421); Binomial Plot (Expected vs Observed Defectives); Cumulative %Defective; Histogram of %Defective]

Summary Stats (95.0% confidence)
               Estimate   Lower CI   Upper CI
%Defective     88.00      75.69      95.47
PPM Def        880000     756899     954665
Process Z      -1.1750    -1.6919    -0.6964
Target         0.00

Note: Add 1.5 to Minitab Z outputs to get the Z.st & CI Z.st for baseline 44/50.
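
The "add 1.5" step can be checked by hand. The sketch below is illustrative only: it assumes the %Defective interval is an exact (Clopper-Pearson) binomial interval and that the process Z is the negative normal quantile of the proportion defective. The function names (`exact_ci`, `z_st`) are made up for this example and are not Minitab or Project Guide calls.

```python
import math
from statistics import NormalDist

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_increasing(func, target):
    """Bisection on (0, 1) for func(p) == target, where func increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(defectives, n, conf=0.95):
    """Two-sided exact (Clopper-Pearson) interval for the proportion defective."""
    alpha = 1 - conf
    lower = 0.0 if defectives == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(defectives - 1, n, p), alpha / 2)
    upper = 1.0 if defectives == n else solve_increasing(
        lambda p: 1 - binom_cdf(defectives, n, p), 1 - alpha / 2)
    return lower, upper

def z_st(p_defective, shift=1.5):
    """Process Z for a proportion defective, plus the 1.5 short-term shift."""
    return -NormalDist().inv_cdf(p_defective) + shift

p_hat = 44 / 50
p_lo, p_hi = exact_ci(44, 50)
print(round(z_st(p_hat), 2))                       # point estimate of Z.st
print(round(z_st(p_hi), 2), round(z_st(p_lo), 2))  # lower and upper Z.st bounds
```

Note that the upper bound on %Defective maps to the lower bound on Z.st, and vice versa.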
Attribute Data
Exercise: Capability for Attribute Data
Problem Statement: Expense reporting first pass yield is too low.
Project Goal: Improve first pass yield from 70% to 85%.

Submitted Reports   Defects   Opportunities per Report
200                 61        1

Task: Determine submitted reports capability
Approach: Work alone or in small groups.
Attribute Data: Manufacturing Yield
First-Pass Yield (%) by Operational Step
FPY = (Good / Attempts)*100
[Diagram: Attempts enter OP 10 and leave as Good, with PRB, Scrap, and Rework branches]
Rolled-Throughput Yield (%) by Product
$\mathrm{RTY} = \prod_i \mathrm{FPY}_i = \mathrm{FPY}_1 \times \mathrm{FPY}_2 \times \mathrm{FPY}_3 \times \cdots$
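
As a quick illustration of the two yield formulas, here is a minimal sketch. The step counts are hypothetical and only serve to show the arithmetic. Each FPY is converted to a fraction before multiplying, then the product is expressed as a percentage again.

```python
import math

def first_pass_yield(good, attempts):
    """FPY in percent: (Good / Attempts) * 100."""
    return good / attempts * 100

def rolled_throughput_yield(fpy_percents):
    """RTY in percent: product of the per-step FPY fractions."""
    return math.prod(f / 100 for f in fpy_percents) * 100

# Hypothetical (good, attempts) counts for three operational steps
steps = [(965, 1000), (980, 1000), (990, 1000)]
fpys = [first_pass_yield(good, attempts) for good, attempts in steps]
rty = rolled_throughput_yield(fpys)
print(fpys, round(rty, 2))
```

RTY is always at or below the lowest single-step FPY, which is why it is the more demanding summary of a multi-step product.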
Individuals Control Chart
[Figure: Individuals Control Chart of Daily FPY (%) by Day, days 1 to 55; X-bar = 96.55, LCL = 90.70]
Purpose: (1) from baseline data, determine threshold limits for prospective monitoring
Individuals Control Chart
NOTE: A statistically stable process is in control, displaying a consistent pattern of variation over time. The variation exhibited by a stable process is considered to be due to chance or common causes that are inherent to the design of the system (product and process). Therefore, a stable process is operating to its full potential by design. If we desire better performance (increase mean FPY, or reduce variation), then a change to the system is required. What type of changes may be effective? Who is responsible for executing changes to the system?
Individuals Control Chart
Purpose: (2) quantify process stability by comparing two estimates of variation:
Long-Term: Sample Standard Deviation
Short-Term: Average Moving Range / 1.128
Stability Index = Long-Term / Short-Term
Process is Unstable When Stability Index >> 1.0
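
The stability index is straightforward to compute from a series of daily values. This is a minimal sketch of the definition above; the function names are illustrative, and `index_from_summary` simply reuses the two summary statistics reported on a chart.

```python
from statistics import mean, stdev

def stability_index(values):
    """Long-term sigma (sample SD) divided by short-term sigma (MR-bar / 1.128)."""
    long_term = stdev(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    short_term = mean(moving_ranges) / 1.128
    return long_term / short_term

def index_from_summary(sample_sd, mr_bar):
    """Same ratio, starting from the sample SD and the average moving range."""
    return sample_sd / (mr_bar / 1.128)

print(round(index_from_summary(2.013, 2.231), 2))  # S = 2.013, MR-bar = 2.231
print(round(stability_index([1, 2, 3, 4, 5]), 2))  # a steady trend inflates the index
```

A steadily drifting series has small point-to-point moving ranges but a large overall standard deviation, so the ratio rises well above 1.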
Example A
[Figure: I-MR Chart of Daily FPY (%) by Day, days 1 to 55. Individuals chart: X-bar = 96.51, LCL = 90.58. Moving Range chart: UCL = 7.288, MR-bar = 2.231, LCL = 0]
I Chart (Long-Term): S = 2.013
MR Chart (Short-Term): S = 2.231 / 1.128 = 1.978
Stability Index = 2.013 / 1.978 = 1.02
Example B
[Figure: I Chart of Daily FPY (%) by Day, days 1 to 55; X-bar = 97.0, LCL = 82.6, UB = 0; several points flagged as out of control]
I Chart (Long-Term): S = 13.45
MR Chart (Short-Term): S = 5.4 / 1.128 = 4.79
Stability Index = 13.45 / 4.79 = 2.81
Example B
NOTE: Variation that exceeds statistical control limits should be treated as due to the presence of a special cause; local action should be taken to investigate, determine root cause, and prevent recurrences.
Example C
[Figure: I Chart of Daily FPY (%) by Day, days 1 to 55; X-bar = 99.32, LCL = 97.40; several points flagged as out of control]
I Chart (Long-Term): S = 3.627
MR Chart (Short-Term): S = 0.72 / 1.128 = 0.638
Stability Index = 3.627 / 0.638 = 5.68
Macro View of FPY by Op
[Figure: AVERAGE FPY (96 to 100) vs. STABILITY INDEX (1 to 6), with points A, B, and C]
Improvement Strategy
[Figure: AVERAGE FPY (96 to 100) vs. STABILITY INDEX (1 to 6), with points A, B, and C and two annotated regions]
Capable But Periodically Unstable: Identify & Remove Special Causes. Daily: MTM/Supervisors
Stable but Chronically Less Capable: Change System (Projects). Monthly, Quarterly: Ops Mgmt, Engr
Non-normal Data
Dataset: DISTSKEW.MTW
Variables: Pos Skew (column B)
Objective: Determine Cpk with Specs: 5-50
Pathway: Stat/Basic Statistics/Graphical Summary
Inputs: select variable Pos Skew to analyze
Is this data normally distributed?
Pathway: Graph/Probability Plot (Test for Normality, default option)
Inputs: select variable Pos Skew to analyze
Two plot layout: Right click on folder icon on toolbar to left of "i" toolbar symbol. Hold down control key and left click on two graph names, right click on the graph names to get layout tool and click on finish.
Layout tool results:
Can we compute Cpk?
Non-normal Data
CPK FOR NON-NORMAL DISTRIBUTION
Dataset: DISTSKEW.MTW
Variables: Pos Skew (column B)
Box-Cox Transformation: Pathway: Stat/Control Charts/Box Cox
Inputs: all obs in one column / select variable "Pos Skew" / Subgroup Size 1
Johnson Transformation: Pathway: Stat/Quality Tools/Johnson Transformation
Inputs: select variable Pos Skew to analyze
Merged layout: λ = 0.0
Non-normal Data
BOX-COX TRANSFORMATION
Box-Cox Table of Transformations

Lambda (λ)   Transformation
1            No transformation
1/2          Square root
0            Log
-1/2         Reciprocal Square Root
-1           Reciprocal
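
The table of transformations can be written as one small function. This is a sketch of the simple power form that matches the table (y raised to λ, with the natural log at λ = 0); it is not claimed to be Minitab's internal routine, and the name `box_cox` is illustrative. The spec limits have to be transformed the same way as the data.

```python
import math

def box_cox(y, lam):
    """Simple power transformation: y**lam, or ln(y) when lam == 0. Requires y > 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    return math.log(y) if lam == 0 else y ** lam

# Specs from the DISTSKEW example (LSL = 5, USL = 50) on the log scale (lambda = 0)
lsl_t, usl_t = box_cox(5, 0), box_cox(50, 0)
print(round(lsl_t, 3), round(usl_t, 3))

# For negative lambda the ordering flips: the transformed LSL exceeds the transformed USL
print(box_cox(5, -1) > box_cox(50, -1))
```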
Example of Minitab Box-Cox Input Screen with Lambda=0
Non-normal Data
CPK WITH TRANSFORMED DATA
What is the Cpk for DistSkew Data Set?
Pathway: Stat/Quality Tools/Capability Analysis/Normal
Inputs: select variable "Pos Skew", subgroup size 1, LSL=5, USL=50, AND click on the Box-Cox button and select "Use optimal lambda"
Recall λ = 0 is the log transformation of the data.
Cpk = __________.
Non-normal Data
CAPABILITY WITH TRANSFORMED DATA
Capability Sixpack
Pathway: Stat/Quality Tools/Capability Sixpack/Normal
Inputs: select variable "Pos Skew", subgroup size 1, LSL=5, USL=50, AND click on the Box-Cox button and select "Use optimal lambda"
Non-normal Data
CAPABILITY WITH RAW DATA
What is the Capability for DistSkew Data Set?
Pathway: Stat/Quality Tools/Capability Analysis/Nonnormal
Inputs: select variable "Pos Skew", subgroup size 1, LSL=5, USL=50, AND select the distribution radio button and choose lognormal from the pull-down
Output:
Non-normal Data
Capability Normal Branch with Box-Cox vs Log-Normal
1) The ppm "Observed" stays the same when you fit the log-normal using either the normal or non-normal capability branch. Actually, the ppm observed will stay the same no matter what distribution you fit to the data.
2) The ppm "Expected Overall" stays the same when you fit the log-normal using either the normal or non-normal capability branch.
3) The Ppks can be very different between using the "capability normal" branch (with the Box-Cox transform) vs using the "capability nonnormal" branch (using lognormal fit) because the "capability nonnormal" branch uses the ISO definition of Ppk.
4) The "capability nonnormal" branch has no Cpk, just Ppk. And it has no confidence interval for Ppk either.
Minitab Assistant vs Method Chooser Minitab Method Chooser Flowchart
Minitab Assistant Flow Chart
Minitab Assistant: Continuous Data
• Minitab Assistant wants 100 or more data points.
• Minitab tests (AD) for normality at the .05 level.
• Minitab Assistant uses THREE rules to check for stability of the process:
– Test 1: Point outside control limits
– Test 2: Nine points in a row on the same side of the centerline
– Test 7 (Modified): 12-15 points within one sigma of the centerline
• Minitab Assistant info: http://www.minitab.com/enCN/support/answers/answer.aspx?ID=2613&langType=1033
Minitab Assistant
Minitab Assistant – Normal Distribution
• Use CABLE.MTW
• Dataset has 100 measurements of the diameter of a cable wire
– 20 hourly samples of n=5
• The engineering specification for this diameter is 0.55 +/- 0.05 cm.
• Our task is to conduct a capability analysis of this process.
Minitab Assistant – Normal Distribution
Minitab Assistant – Normal Distribution
Capability Analysis for Diameter: Report Card
Stability: The process mean and variation are stable. No points are out of control.
Number of Subgroups: You only have 20 subgroups. For a capability analysis, it is generally recommended that you collect at least 25 subgroups over a long enough period of time to capture the different sources of process variation.
Normality: Your data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data: The total number of observations is 100 or more. The capability estimates should be reasonably precise.

Capability Analysis for Diameter: Process Performance Report
[Figure: Capability Histogram with LSL and USL. Are the data inside the limits?]
Process Characterization: Total N = 100, Subgroup size = 5, Mean = 0.54646, StDev (overall) = 0.019341, StDev (within) = 0.018548
Capability Statistics
Actual (overall): Pp = 0.86, Ppk = 0.80, Z.Bench = 2.29, % Out of spec (observed) = 2.00, % Out of spec (expected) = 1.10, PPM (DPMO) (observed) = 20000, PPM (DPMO) (expected) = 10969
Potential (within): Cp = 0.90, Cpk = 0.83, Z.Bench = 2.41, % Out of spec (expected) = 0.81, PPM (DPMO) (expected) = 8072
Actual (overall) capability is what the customer experiences. Potential (within) capability is what could be achieved if process shifts and drifts were eliminated.

Capability Analysis for Diameter: Diagnostic Report
[Figure: Xbar-R Chart (Confirm that the process is stable) and Normality Plot (The points should be close to the line)]
Normality Test (Anderson-Darling): Results = Pass, P-value = 0.794

Capability Analysis for Diameter: Summary Report
Customer Requirements: Upper Spec = 0.6, Target = *, Lower Spec = 0.5
Process Characterization: Mean = 0.54646, Standard deviation = 0.019341
Actual (overall) capability: Pp = 0.86, Ppk = 0.80, Z.Bench = 2.29, % Out of spec = 1.10, PPM (DPMO) = 10969
How capable is the process? Z.Bench = 2.29 (gauge from 0, Low, to 6, High)
Conclusions: The defect rate is 1.10%, which estimates the percentage of parts from the process that are outside the spec limits.
Minitab Assistant: Non-Normal Data
• Use TILES.MTW
• Choose Minitab Assistant – Capability Analysis
• Detects non-normality and offers the option of transformation (Box-Cox)
Minitab Assistant: Non-Normal Data
Capability Analysis for Warping: Report Card
Stability: The process mean and variation are stable. No points are out of control.
Number of Subgroups: You only have 10 subgroups. For a capability analysis, it is generally recommended that you collect at least 25 subgroups over a long enough period of time to capture the different sources of process variation.
Normality: The transformed data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data: The total number of observations is 100 or more. The capability estimates should be reasonably precise.
Minitab Assistant: Non-Normal Data
Capability Analysis for Warping: Diagnostic Report
[Figure: Xbar-S Chart (Confirm that the process is stable) and Normality Plot, lambda = 0.50 (The points should be close to the line)]
Normality Test (Anderson-Darling)
            Original   Transformed
Results     Fail       Pass
P-value     0.010      0.574

Capability Analysis for Warping: Summary Report
Customer Requirements: Upper Spec = 8, Target = *, Lower Spec = *
Process Characterization: Mean = 2.9231, Standard deviation = 1.7860
Actual (overall) capability: Pp = *, Ppk = 0.75, Z.Bench = 2.24, % Out of spec = 1.26, PPM (DPMO) = 12569
How capable is the process? Z.Bench = 2.24 (gauge from 0, Low, to 6, High)
Conclusions: The defect rate is 1.26%, which estimates the percentage of parts from the process that are outside the spec limits.

Capability Analysis for Warping: Process Performance Report
[Figure: Capability Histogram of the transformed data with USL. Are the data below the limit?]
Process Characterization: Total N = 100, Subgroup size = 10
Capability Statistics
Actual (overall): Pp = *, Ppk = 0.75, Z.Bench = 2.24, % Out of spec (observed) = 2.00, % Out of spec (expected) = 1.26, PPM (DPMO) (observed) = 20000, PPM (DPMO) (expected) = 12569
Potential (within): Cp = *, Cpk = 0.76, Z.Bench = 2.28, % Out of spec (expected) = 1.12, PPM (DPMO) (expected) = 11249
Actual (overall) capability is what the customer experiences. Potential (within) capability is what could be achieved if process shifts and drifts were eliminated.
Confidence Limits: NOT IN ASSISTANT
• Stat -> Quality Tools -> Capability Analysis -> Normal
– Follow this path if you need to calculate the Lower Confidence Bound on Cpk or Ppk
Note: The Normal branch has Box-Cox transformations (for non-normal data) that allow you to get Cpk and Ppk and confidence intervals for Cpk & Ppk on the transformed scale.
Note: There is no Cpk or a confidence interval for Ppk if you use the Non-Normal branch.
Note: The Minitab Assistant DOES NOT give confidence limits for capability indices. It allows you to use the Box-Cox transform when it detects non-normal data.
Confidence Limits: Normal Case
Select one-sided lower limit
Confidence Limits: Normal Case
Process Capability of Diameter (using 95.0% confidence)
[Figure: histogram of Diameter with LSL, USL, and Within/Overall normal curves]

Process Data: LSL = 0.5, Target = *, USL = 0.6, Sample Mean = 0.54646, Sample N = 100, StDev(Within) = 0.0185477, StDev(Overall) = 0.0193414

Potential (Within) Capability: Cp = 0.90 (Lower CL 0.78), CPL = 0.83, CPU = 0.96, Cpk = 0.83 (Lower CL 0.71)
Overall Capability: Pp = 0.86 (Lower CL 0.76), PPL = 0.80, PPU = 0.92, Ppk = 0.80 (Lower CL 0.69), Cpm = * (Lower CL *)

                           PPM < LSL   PPM > USL   PPM Total
Observed Performance       10000.00    10000.00    20000.00
Exp. Within Performance    6124.50     1947.11     8071.61
Exp. Overall Performance   8150.57     2818.71     10969.28

Cpk and 95% Lower confidence limit for Cpk
Ppk and 95% Lower confidence limit for Ppk
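
The indices and expected PPM in the Minitab output above follow from the mean, the standard deviation, and the spec limits under a normal model. The sketch below uses the standard textbook definitions; the function name `capability` and its dictionary keys are illustrative choices, not Minitab identifiers.

```python
from statistics import NormalDist

def capability(mean, sd, lsl, usl):
    """Normal-model capability indices and expected PPM for a two-sided spec."""
    z = NormalDist()
    lower_index = (mean - lsl) / (3 * sd)   # CPL or PPL
    upper_index = (usl - mean) / (3 * sd)   # CPU or PPU
    ppm_below = z.cdf((lsl - mean) / sd) * 1e6
    ppm_above = (1 - z.cdf((usl - mean) / sd)) * 1e6
    ppm_total = ppm_below + ppm_above
    return {
        "potential": (usl - lsl) / (6 * sd),          # Cp or Pp
        "actual": min(lower_index, upper_index),      # Cpk or Ppk
        "ppm_total": ppm_total,
        "z_bench": z.inv_cdf(1 - ppm_total / 1e6),
    }

# Within sigma gives Cp/Cpk; overall sigma gives Pp/Ppk
within = capability(0.54646, 0.0185477, 0.5, 0.6)
overall = capability(0.54646, 0.0193414, 0.5, 0.6)
print({k: round(v, 2) for k, v in within.items()})
print({k: round(v, 2) for k, v in overall.items()})
```

The only difference between the "within" and "overall" rows is which standard deviation estimate is passed in.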
Confidence Limits: Normal Case
LOWER 95% CONFIDENCE FOR OBSERVED CPK

           Sample Size (n)
Obs Cpk    10     20     30     40     50     75     100    150    200
0.5        0.24   0.32   0.35   0.37   0.39   0.41   0.42   0.43   0.44
0.6        0.31   0.40   0.44   0.46   0.47   0.50   0.51   0.53   0.54
0.7        0.38   0.48   0.52   0.54   0.56   0.59   0.60   0.62   0.63
0.8        0.44   0.55   0.60   0.63   0.65   0.67   0.69   0.71   0.72
0.9        0.51   0.63   0.68   0.71   0.73   0.76   0.78   0.80   0.82
1.0        0.58   0.71   0.76   0.79   0.82   0.85   0.87   0.89   0.91
1.1        0.64   0.78   0.84   0.88   0.90   0.94   0.96   0.99   1.00
1.2        0.70   0.86   0.92   0.96   0.99   1.03   1.05   1.08   1.09
1.3        0.77   0.93   1.00   1.04   1.07   1.11   1.14   1.17   1.19
1.4        0.83   1.01   1.08   1.13   1.15   1.20   1.23   1.26   1.28
1.5        0.89   1.08   1.16   1.21   1.24   1.29   1.32   1.35   1.37
1.6        0.96   1.16   1.24   1.29   1.32   1.37   1.41   1.44   1.46
1.7        1.02   1.23   1.32   1.37   1.41   1.46   1.49   1.53   1.55
1.8        1.08   1.30   1.40   1.45   1.49   1.55   1.58   1.62   1.65
1.9        1.14   1.38   1.48   1.54   1.57   1.64   1.67   1.71   1.74
2.0        1.21   1.45   1.56   1.62   1.66   1.72   1.76   1.80   1.83
2.1        1.27   1.53   1.64   1.70   1.74   1.81   1.85   1.89   1.92
2.2        1.33   1.60   1.71   1.78   1.83   1.90   1.94   1.99   2.01
2.3        1.39   1.67   1.79   1.86   1.91   1.98   2.03   2.08   2.11
2.4        1.45   1.75   1.87   1.94   1.99   2.07   2.11   2.17   2.20
2.5        1.52   1.82   1.95   2.03   2.08   2.16   2.20   2.26   2.29
2.6        1.58   1.90   2.03   2.11   2.16   2.24   2.29   2.35   2.38
2.7        1.64   1.97   2.11   2.19   2.24   2.33   2.38   2.44   2.47
2.8        1.70   2.04   2.19   2.27   2.33   2.42   2.47   2.53   2.57
2.9        1.76   2.12   2.27   2.35   2.41   2.50   2.56   2.62   2.66
3.0        1.82   2.19   2.34   2.43   2.50   2.59   2.65   2.71   2.75
3.1        1.89   2.26   2.42   2.52   2.58   2.68   2.73   2.80   2.84
3.2        1.95   2.34   2.50   2.60   2.66   2.76   2.82   2.89   2.93
3.3        2.01   2.41   2.58   2.68   2.75   2.85   2.91   2.98   3.03
3.4        2.07   2.48   2.66   2.76   2.83   2.94   3.00   3.07   3.12
3.5        2.13   2.56   2.74   2.84   2.91   3.02   3.09   3.16   3.21
3.6        2.19   2.63   2.82   2.92   3.00   3.11   3.18   3.25   3.30
3.7        2.25   2.71   2.89   3.01   3.08   3.20   3.26   3.34   3.39
3.8        2.32   2.78   2.97   3.09   3.16   3.28   3.35   3.44   3.48
3.9        2.38   2.85   3.05   3.17   3.25   3.37   3.44   3.53   3.58
4.0        2.44   2.93   3.13   3.25   3.33   3.46   3.53   3.62   3.67
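
The slides do not state which formula produced the table of lower bounds. The sketch below assumes the widely used normal approximation (often attributed to Bissell), Cpk − z·sqrt(1/(9n) + Cpk²/(2(n−1))), which reproduces the spot-checked table entries; treat it as an illustration, not as the table's documented source. Minitab's own lower confidence limit uses a slightly different calculation, so it can differ in the second decimal.

```python
import math
from statistics import NormalDist

def cpk_lower_bound(cpk, n, conf=0.95):
    """Approximate one-sided lower confidence bound for an observed Cpk (assumed formula)."""
    z = NormalDist().inv_cdf(conf)
    return cpk - z * math.sqrt(1 / (9 * n) + cpk**2 / (2 * (n - 1)))

for n in (10, 50, 200):
    print(n, round(cpk_lower_bound(1.0, n), 2))
```

Read the other way around, the formula shows how large an observed Cpk must be before the lower bound clears a requirement such as 1.33.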
EXERCISE: Confidence Bound for Cpk
Simulation Study of Cpk:
1) Simulate 10,000 rows with 5 columns of a normal distribution with mean = 10 and std. dev. = 1.
2) Compute the mean and standard deviation for each row.
3) Use LSL = 7, USL = 13, and compute Cpl & Cpu for each row.
4) Take the min of Cpl & Cpu to get Cpk for each row.
5) Make a histogram of the simulated Cpks.
Does the distribution of simulated Cpks look normal? What should the theoretical Cpk be from the mean, standard deviation and specs? What is the distribution of Cpk lower bounds? How often does the lower confidence bound contain the "true" value for Cpk?
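
For readers without Minitab at hand, steps 1 to 4 of the exercise can be sketched in plain Python. The seed and function name are arbitrary choices for this illustration; plotting the histogram (step 5) is left to whichever charting tool is available.

```python
import random
from statistics import mean, median, stdev

def simulate_cpk(n=5, rows=10_000, mu=10.0, sigma=1.0, lsl=7.0, usl=13.0, seed=1):
    """Simulate `rows` samples of size n and return the sample Cpk of each row."""
    rng = random.Random(seed)
    cpk_values = []
    for _ in range(rows):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m, s = mean(sample), stdev(sample)
        cpl = (m - lsl) / (3 * s)
        cpu = (usl - m) / (3 * s)
        cpk_values.append(min(cpl, cpu))
    return cpk_values

cpks = simulate_cpk()
true_cpk = min((13 - 10) / (3 * 1), (10 - 7) / (3 * 1))
print(true_cpk, round(mean(cpks), 3), round(median(cpks), 3), round(max(cpks), 2))
```

With n = 5 the sample Cpk values are strongly right-skewed, so the mean sits above the median and occasional very large values appear.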
Simulation Results for n=5
Summary for Sample Cpk Estimated from n=5
Mu=10, Sigma=1, LSL=7, USL=13, Population Cpk = 1.0
[Figure: histogram of sample Cpk (axis 1.1 to 7.7) with 95% confidence intervals for mean and median]

Anderson-Darling Normality Test: A-Squared = 436.22, P-Value < 0.005
Mean = 1.1036, StDev = 0.5573, Variance = 0.3106, Skewness = 2.7486, Kurtosis = 13.4533, N = 10000
Minimum = 0.3191, 1st Quartile = 0.7542, Median = 0.9676, 3rd Quartile = 1.2816, Maximum = 7.8123
95% Confidence Interval for Mean: 1.0926 to 1.1145
95% Confidence Interval for Median: 0.9589 to 0.9761
95% Confidence Interval for StDev: 0.5497 to 0.5651

In theory, Cpk = min( (13 – 10)/(3*1), (10 – 7)/(3*1) ) = 1.000
Simulation Results for n=5, 10, 20, 30, 50, 100
[Figure: histograms of simulated Cpk with fitted Normal curve, one panel per sample size]

Sample size n   Mean     StDev     N
5               1.104    0.5573    10000
10              1.007    0.2853    10000
20              0.9802   0.1763    10000
30              0.9770   0.1411    10000
50              0.9788   0.1053    10000
100             0.9825   0.07376   10000
Simulation Results for n=5, 10, 20, 30, 50, 100
[Figure: Boxplot of Sample Cpk Estimates by Sample Size (Cpk 5, 10, 15, 20, 25, 30, 50, 100)]
Assumptions: Normal, μ = 10, σ = 1, LSL = 7, USL = 13, True Cpk = 1.0
Simulation Results for n=5, 10, 20, 30, 50, 100
[Figure: Boxplot of Cpk Lower Bounds by Sample Size (Cpk LB 5, 10, 15, 20, 25, 30, 50, 100)]
Assumptions: Normal, μ = 10, σ = 1, LSL = 7, USL = 13, True Cpk = 1.0
Performance of Cpk Lower Confidence Bound
[Figure: 95% Lower Confidence Bound for Cpk, Miss Rate (0% to 6%) vs. True Population Mean (7.0 to 10.0), one curve per sample size N = 5, 10, 15, 20, 25, 30, 50, 100; Population Sigma Varies to Make Cpk = 1.0; reference line at 5.00%; LSL = 7, Target Nominal = 10, USL = 13]
Confidence Limits: Normal Case
LOWER 95% CONFIDENCE FOR OBSERVED CPK
Simulation Results
• Formula is conservative when process mean is on target (better than 95% coverage of true Cpk value).
• As process mean deviates from target, formula provides approximately the stated reliability in performance (95%), regardless of sample size.
Relationship Between Cpk & Tolerance Intervals (Confidence/Reliability Levels)
Attribute Sample Sizes Using c=0 Plans
NUMBER OF TESTS WITHOUT FAILURE VS RELIABILITY AND CONFIDENCE

              Confidence Level (%)
Reliability   50       60       70        75        80        85        90        95        97.5      99        99.5      99.9
0.999999      693147   916291   1203973   1386294   1609438   1897120   2302584   2995731   3688878   4605168   5298315   6907752
0.99999       69315    91629    120397    138629    160943    189712    230258    299572    368887    460515    529830    690773
0.9999        6932     9163     12040     13863     16094     18971     23025     29956     36887     46050     52981     69075
0.999         693      916      1204      1386      1609      1897      2302      2995      3688      4603      5296      6905
0.998         347      458      602       693       804       948       1151      1497      1843      2301      2647      3451
0.997         231      305      401       462       536       632       767       998       1228      1533      1764      2300
0.996         173      229      301       346       402       474       575       748       921       1149      1322      1724
0.995         139      183      241       277       322       379       460       598       736       919       1058      1379
0.994         116      153      201       231       268       316       383       498       613       766       881       1148
0.993         99       131      172       198       230       271       328       427       526       656       755       984
0.992         87       115      150       173       201       237       287       373       460       574       660       861
0.991         77       102      134       154       179       210       255       332       409       510       587       765
0.99          69       92       120       138       161       189       230       299       368       459       528       688
0.98          35       46       60        69        80        94        114       149       183       228       263       342
0.97          23       31       40        46        53        63        76        99        122       152       174       227
0.96          17       23       30        34        40        47        57        74        91        113       130       170
0.95          14       18       24        28        32        37        45        59        72        90        104       135
0.94          12       15       20        23        27        31        38        49        60        75        86        112
0.93          10       13       17        20        23        27        32        42        51        64        74        96
0.92          9        11       15        17        20        23        28        36        45        56        64        83
0.91          8        10       13        15        18        21        25        32        40        49        57        74
0.9           7        9        12        14        16        19        22        29        36        44        51        66
0.8           4        5        6         7         8         9         11        14        17        21        24        31
0.7           2        3        4         4         5         6         7         9         11        13        15        20
0.6           2        2        3         3         4         4         5         6         8         10        11        14
0.5           1        2        2         3         3         3         4         5         6         7         8         10
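
The c=0 sample sizes follow from requiring that n consecutive passes be unlikely if the true reliability were only R: R^n ≤ 1 − confidence. The sketch below solves that inequality for n. It reproduces the spot-checked table entries; a cell where the ratio lands exactly on a whole number can differ by one depending on how the rounding was done.

```python
import math

def zero_failure_sample_size(reliability, confidence):
    """Smallest n with reliability**n <= 1 - confidence (zero failures allowed)."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

for reliability in (0.90, 0.95, 0.99):
    print(reliability, zero_failure_sample_size(reliability, 0.95))
```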
Producer's Risk Of Using c=0 Plans

              Sample Size Required      Reliability Level Required to Pass
              (0 Failures Allowed)      50% Chance               95% Chance
Reliability   Conf 90    Conf 95        Conf 90    Conf 95       Conf 90    Conf 95
0.999         2302       2995           0.99970    0.99977       0.99998    0.99998
0.997         767        998            0.99910    0.99931       0.99993    0.99995
0.99          230        299            0.9970     0.9977        0.9997     0.9998
0.95          45         59             0.9847     0.9883        0.9989     0.9991
0.90          22         29             0.9690     0.9764        0.9977     0.9982
0.80          11         14             0.9389     0.9517        0.9950     0.9963

NOTE: The term reliability in a compliance testing context refers to conformance to design requirements, not to the actual device field performance level. The difference is due to unaccounted design margin between spec limits and the variation required to degrade field performance.
c=0 Plans Maximize The Chances Of A Good Process Failing The Study
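
The "Reliability Level Required to Pass" columns can be reproduced by asking what true reliability R gives a stated chance of n tests in a row passing, that is, R^n = chance. A minimal sketch with illustrative function names:

```python
def pass_probability(true_reliability, n):
    """Chance that all n units pass when each conforms with the given probability."""
    return true_reliability ** n

def reliability_to_pass(n, chance):
    """True reliability needed for the stated chance of passing n tests with 0 failures."""
    return chance ** (1 / n)

# n = 59 is the 95% confidence / 95% reliability plan
print(round(reliability_to_pass(59, 0.50), 4), round(reliability_to_pass(59, 0.95), 4))
print(round(pass_probability(0.99, 59), 3))  # a 99%-conforming process can still fail often
```

This is the producer's risk: the process has to be far better than the reliability being demonstrated in order to pass the study most of the time.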
The Big Picture of Compliance Testing
BEFORE: Characterization Studies
• Inject sources of variation to stress system
• Experimentation (DOE)
• Simulation Modeling
• Measure design margin
• Optimization
DURING: Qualification Studies
• n delivers conf%/rel%
• Limited conditions
• Representative sample?
• One time point
AFTER: Process Stability
• All sources of variation will be acting over the long term in the future
• Need to detect significant changes
How To Move Away From A Compliance Testing Culture Toward A Capability Culture
Identify Critical Requirements
1. Perform thorough characterization studies; inject sources of variation, test to failure
2. Demand variables data, system performance modeling, measure design margin, robust design, optimization: then you can skip compliance testing!
3. If variables data is unavailable, challenge that!
4. For attribute data: select risk-based confidence/reliability levels, perform compliance testing or else cite the work done during characterization!
5. Control processes to ensure that our system robustness does not deteriorate over time and that we are alerted to assignable causes of variation if they occur
Impact of Measurement Error (Imprecision)
$\sigma^2_{\text{observed}} = \sigma^2_{\text{product}} + \sigma^2_{\text{measurement error}}$
Process capability study variation is inflated by measurement error (gage repeatability & reproducibility). Therefore, if an independent gage R&R study has been completed, then subtract the measurement error from the observed process capability variation to estimate true product variation:
$\sigma^2_{\text{product}} = \sigma^2_{\text{observed}} - \sigma^2_{\text{measurement error}}$
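
The subtraction is done on variances, not on standard deviations. A minimal sketch, with made-up numbers chosen only to keep the arithmetic easy to follow:

```python
import math

def product_sd(observed_sd, measurement_sd):
    """Estimate the product standard deviation by subtracting the gage variance."""
    variance = observed_sd**2 - measurement_sd**2
    if variance < 0:
        raise ValueError("measurement variance exceeds observed variance")
    return math.sqrt(variance)

# Hypothetical values: observed sigma 5, gage R&R sigma 3
print(product_sd(5.0, 3.0))
```

If the gage study reports a larger variance than the capability study observed, the estimates are inconsistent and the subtraction should not be forced.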
Capability Analysis: Summary
PROCESS CAPABILITY: THE NATURAL VARIABILITY IN A PROCESS.
VARIABLES DATA
Cp, Pp: Measure of process potential (for a centered process)
Cpk, Ppk: Measure of actual process capability
PPM estimates from PROCESS CAPABILITY INDICES ASSUME THAT THE PROCESS IS STABLE AND FOLLOWS A NORMAL (BELL-SHAPED) DISTRIBUTION.
Make sure there are no shifts or trends
If the data are not normal, try other parametric distributions (Weibull or lognormal) or Box-Cox transformation
If those fail, consider the data as attribute
Consider the impact of sample size and how the data was collected (short-term vs. long-term) when making inferences – use confidence bounds to incorporate uncertainty in estimates
ATTRIBUTE DATA
The proportion (p bar) from the P chart is the process capability.
Summary Quiz
True or False
___________ You can ignore plotting the data and just compute Ppk.
___________ The "Total Exp. Overall" ppm are the same for log-normal data in Minitab for both of these approaches: 1) the Normal Capability branch (with lambda=0) and 2) the non-normal Capability branch and selecting Log-normal.
___________ The smaller the sample used to compute Ppk, the better. It is less work to collect the data.
Summary And Recap
Measuring Process Capability
• Sigma Scale, Z scores, DPM=PPM
• Process Capability Indices (Cp, Cpk, Pp, Ppk)
• Impact of Normality & Process Stability
Attribute Data
Non-normal Data
Minitab Assistant
Impact of Sample Size (Confidence Limits)
Comparison to Tolerance Intervals
Impact of Measurement Error
Chapter 4B: Tolerance Intervals
Topics
• Tolerance Intervals
– Calculations
– Sample Size
Statistical Tolerance Intervals
• From the Medtronic Handbook of Statistics:
– For variables data, a statistical tolerance interval places limits on the variation expected in individual items from a population.
– A tolerance interval is described by two parameters: confidence level and population fraction (sometimes called "reliability", for fraction meeting spec(s))
Tolerance Intervals – New in Minitab 16
• A new feature in Minitab 16 is the calculation of tolerance intervals using a normal-distribution assumption.
– The normal distribution assumption is critical. Unlike confidence intervals for the mean, which are fairly robust to a lack of normality, tolerance intervals depend heavily on it.
Normal distribution Tolerance Interval
Statistical Tolerance Intervals
• From the Medtronic Handbook of Statistics, cont'd:
– If the data is not normal, transformations should be tried to obtain normality. For example, if the data were lognormal then tolerance intervals could be constructed on the log of the data.
– If the underlying population distribution is known but is not normal then reliability/distribution analysis techniques can be used.
– Tolerance Intervals generally should be
• Two-sided if the specification is two-sided
• One-sided if the specification is one-sided
Tolerance Interval Calculation
• First determine appropriate data distribution or transformation
• For Normal distribution or transformation to normal distribution
– Use Stat -> Quality Tools -> Tolerance Interval
• For other distribution (e.g. Weibull)
– Use Stat -> Reliability/Survival -> Parametric Distribution Analysis
Example: Tolerance Intervals
• Use Ch1DataFile.mtw
• Variables TubeTensile1, TubeTensile2, TubeTensile3
Step 1: Identify Distribution
• Stat -> Basic Statistics -> Normality Test
Step 2: Calculate tolerance bound
Lower Tolerance Bound for TubeTensile1
Tolerance Interval Output – TubeTensile1
A Very Confusing Output in Minitab 16
[Figure: Tolerance Interval Plot for Tensile Bond 1_1, 95% Lower Bound, At Least 95% of Population Covered; panels: interval plot (Normal and Nonparametric) and Normal Probability Plot]
Statistics: N = 30, Mean = 41.063, StDev = 10.127
Normal: Lower = 18.583
Nonparametric: Lower = 7.560
Normality Test: AD = 0.772, P-Value = 0.040

Tell everyone you know who uses Minitab: The 95%/95% statement on the display ONLY applies to the Normal-distribution interval, not the Nonparametric Interval
Must Look in Session Window:
Try this using Summarized Data
Try using other Sample Sizes
95/95 Nonparametric One-sided requires n=59
95/95 Nonparametric Two-sided requires n=93
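
The two nonparametric sample sizes can be reproduced from order-statistic arguments. The sketch below assumes the one-sided bound is the sample minimum (or maximum) and the two-sided interval runs from the sample minimum to the sample maximum; those are the usual textbook constructions, and the function names are illustrative.

```python
def one_sided_n(coverage=0.95, confidence=0.95):
    """Smallest n where the sample minimum is a lower bound on `coverage` of the population."""
    n = 1
    while 1 - coverage**n < confidence:
        n += 1
    return n

def two_sided_n(coverage=0.95, confidence=0.95):
    """Smallest n where [min, max] covers at least `coverage` of the population."""
    n = 2
    while 1 - n * coverage ** (n - 1) + (n - 1) * coverage**n < confidence:
        n += 1
    return n

print(one_sided_n(), two_sided_n())
```

With fewer observations than these, the nonparametric interval is still reported, but it does not achieve the stated confidence, which is why the achieved confidence in the Session Window has to be checked.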
Exercise
• Compute 95/95 lower tolerance bounds for
– TubeTensile2, TubeTensile3
• Compute 95/90 lower tolerance bounds for
– TubeTensile2, TubeTensile3
• Compute 95/95 two-sided tolerance intervals for
– TubeTensile2, TubeTensile3
Using "Summarized Data" option to Evaluate Sample Size for Tolerance Intervals
• Imagine having the following historical data on the pull strength of an electrode to plan a future study using Tolerance Intervals
– A normal distribution assumption is appropriate
– The historical mean is 4.92 lbs
– The historical standard deviation is 0.87 lbs
– The lower specification limit for pull strength is 2.0 lbs
Sample Size Evaluation for Normal-Distribution Tolerance Intervals
• Ask:
– How likely are these results to predict the results of the future study?
– Will the future study run under the same conditions? Worst case?
– Would that affect the mean or standard deviation we expect?

Sample Size for Normal Distribution Tolerance Intervals
• For example, might decide to use a larger standard deviation value, say 1.10 (approximately 25% larger) as the planning value
• Need to know confidence and reliability to demonstrate. For example, let's use 95% confidence and 95% reliability.
• Start with n=30 and see if that sample size would be large enough . . .
Sample Size for Normal Distribution Tolerance Intervals
Since the one-sided tolerance interval is above the specification value of 2.0, n=30 is large enough
Now try smaller sample sizes . . .
n=14 is the smallest sample size that produces an interval above 2.0
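The search over sample sizes can also be sketched outside Minitab. The block below is an illustration, not Minitab's implementation: it assumes the one-sided normal tolerance bound is mean - k*s, with k chosen so the bound sits below the true 5th percentile with 95% confidence, and it finds k by numerically integrating over the distribution of s. The planning values (mean 4.92, standard deviation 1.10, lower specification 2.0) come from the slides; all function names are hypothetical.

```python
import math
from statistics import NormalDist

def achieved_confidence(k, n, z_p, steps=400):
    """Probability that xbar - k*s falls below the true (1 - p) quantile."""
    nu = n - 1                                   # degrees of freedom of s
    upper = math.sqrt(nu) + 8.0                  # chi(nu) mass beyond this is negligible
    h = upper / steps
    log_c = (1 - nu / 2) * math.log(2) - math.lgamma(nu / 2)
    total = 0.0                                  # integrand is 0 at u = 0 when nu >= 2
    for i in range(1, steps + 1):
        u = i * h                                # u = square root of a chi-square variate
        density = math.exp(log_c + (nu - 1) * math.log(u) - u * u / 2)
        arg = math.sqrt(n) * (k * u / math.sqrt(nu) - z_p)
        phi = 0.5 * (1 + math.erf(arg / math.sqrt(2)))
        weight = 1 if i == steps else (4 if i % 2 else 2)   # Simpson weights
        total += weight * density * phi
    return total * h / 3

def k_factor(n, confidence=0.95, reliability=0.95):
    """One-sided tolerance factor, found by bisection (requires n >= 3)."""
    z_p = NormalDist().inv_cdf(reliability)
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if achieved_confidence(mid, n, z_p) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def smallest_n(mean, sd, lower_spec, n_max=100):
    """Smallest n whose lower tolerance bound stays above the specification."""
    for n in range(3, n_max + 1):
        if mean - k_factor(n) * sd > lower_spec:
            return n
    return None

print(smallest_n(4.92, 1.10, 2.0))
```

Because k shrinks as n grows, the first n that clears the specification is the smallest one, which is the same search the slides perform by trying sample sizes in the Summarized Data dialog.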
10
Exercises • Choose a sample size for – Normal distribution tolerance interval – One-sided specification: Min 3 lbf – Planning data: TubeTensile3
• Choose a sample size for – Normal distribution tolerance interval – Two-sided specification: 3.5 to 4.0 – Planning data: Spacing4
Tolerance Interval for Non-Normal Distributions • If a normal-distribution model is not appropriate for the data, then either – Transform the data to Normal • Use the (normal distribution) Tolerance Interval module
– Or identify a non-normal distribution (e.g. Weibull) • Use Stat -> Reliability/Survival -> Parametric Distribution Analysis • Use confidence intervals on percentiles to determine the Tolerance Interval Limits
Tolerance Intervals via Reliability/Survival Menu
One-sided
• Lower 95% / 95%: – Calculate one-sided lower 95% confidence bound on the 5th percentile
• Upper 95% / 95%: – Calculate one-sided upper 95% confidence bound on the 95th percentile
Two-sided
• Two-sided 95% / 95%:
– Calculate two-sided confidence intervals for the 2.5th and 97.5th percentiles.
– Lower bound is the lower 95% bound on the 2.5th percentile
– Upper bound is the upper 95% bound on the 97.5th percentile
Weibull Tolerance Interval • Use Stat -> Quality Tools -> Individual Distribution Identification • Data was randomly generated from Weibull with shape 2 and scale 25. • All except Normal fit well • Imagine that due to subject-matter knowledge, Weibull is believed to be the best model
Tolerance Intervals via Reliability/Survival
Weibull 95/95 Lower Bound
Weibull 95/95 Two-sided Tolerance Interval
Interval is 2.03 to 61.51
Sample size for Weibull Tolerance Interval • See Medtronic Corporate Statistical Resources – Work Aid #2
Summary and Review • Tolerance Intervals – Calculations – Sample Size
General Linear Models (GLM) I feel like I’m regressing
LeRoy Mattson Jeremy Strief
Objectives • Understand how GLM is a generalization of ANOVA and regression • Understand three primary concepts within GLM models – Fixed vs. Random effects – Nesting vs. Crossing – Covariate (Continuous) vs. Factor (Attribute)
• Fit GLM in Minitab
Recap from Quality Trainer
• One-Way ANOVA
• Two-Way ANOVA
• Correlation & Regression
Statistical Tools for Analyzing key Xs

                                 X: Variables (continuous)   X: Attribute (discrete)
Y: Variables (continuous data)   Regression                  t-test (1 X, 2 levels)
                                 Multiple Regression         One-way ANOVA
                                 GLM                         GLM
Y: Attribute (discrete data)     Logistic Regression         Chi Square
                                                             Logistic Regression
General Linear Models
• GLM: Concepts
• GLM: Variable Y – One Attribute X
• GLM: Variable Y – Two Attribute Xs
• GLM: Variable Y – Mixture of Attribute & Variable Xs
GLM Introduction
• GLM stands for General Linear Model
• A flexible, unified approach to regression and ANOVA.
• Needed when building a Y=f(X) transfer function when the input variables don’t match a standard regression or ANOVA approach:
– Regression assumes continuous X’s
– ANOVA treats X’s as attributes, and it often requires a balanced experimental design in Minitab
– What if your dataset does not fit into the ANOVA or Regression mold?
Motivating Example Pin Pulls.mtw
• MECC began collecting data around pull strength for a particular component. • Due to the nature of the investigation and due to resource constraints, it was not possible to execute a formal DOE. • Data were collected over a series of months, and sample sizes were not equally distributed across all the engineering conditions of interest. (So the dataset is unbalanced, in DOE language.)
Motivating Example • Response variable (Y): Pull Strength • Predictor Variables (Xs): – Hole diameter: 17.5, 18.5, or 19.5 – Fillet Style: one-sided or two-sided – Solder size: small or large
• Fillet style and Solder size are attribute metrics • Hole diameter is a variables/continuous metric
Data are unbalanced
Tabulated statistics: hole diameter, 1 or 2 sided fillet, solder size
Rows: hole diameter   Columns: 1 or 2 sided fillet

Results for solder size = 1
          1    2   All
17.5      4    3     7
18.5      0    0     0
19.5      4    2     6
All       8    5    13

Results for solder size = 2
          1    2   All
17.5      9    4    13
18.5      4    0     4
19.5     16   12    28
All      29   16    45
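A quick way to see the imbalance is to list the cell counts and test whether they are all equal. The sketch below simply re-enters the counts from the tabulated statistics; the dictionary layout and variable names are illustrative choices.

```python
# counts[solder_size][hole_diameter] = (one-sided fillet, two-sided fillet)
counts = {
    1: {17.5: (4, 3), 18.5: (0, 0), 19.5: (4, 2)},
    2: {17.5: (9, 4), 18.5: (4, 0), 19.5: (16, 12)},
}

# Flatten the 2 x 3 x 2 layout into a list of 12 cell counts
cells = [n for by_diameter in counts.values()
         for pair in by_diameter.values()
         for n in pair]

total = sum(cells)
empty_cells = sum(1 for n in cells if n == 0)
balanced = len(set(cells)) == 1      # balanced means every cell has the same count
print(total, empty_cells, balanced)
```

Unequal counts rule out the balanced ANOVA routines, and the empty cells limit which interactions can be estimated.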
How to Analyze in Minitab?
• With multiple X’s of various types, GLM is the only method which can be used to analyze the data in Minitab
• JMP also offers flexible modeling platforms through “Custom Design” and “Fit Model”
Three Main Concepts in GLM • Predictor variables (Xs) can be characterized in three ways: – Fixed vs. Random effects – Nesting vs. Crossing – Covariate (Continuous) vs. Factor (Attribute)
An Unfortunate Naming Convention • In statistical literature, there are two types of models whose names are confusingly similar. • The General Linear Model is the main topic of today’s talk. – Y is continuous – X can be continuous or categorical
• The Generalized Linear Model is a further abstraction of the General Linear Model. – Y can be continuous or categorical – X can be continuous or categorical – Subcategories of Generalized Linear Models are • Logistic regression for a binary Y • Poisson regression for a count-based Y • General linear model for a continuous Y
• The Advanced SME class will focus on the General Linear Model in Ch 5 and on Logistic Regression in Ch 6.
Topics to be covered GLM: Variable Y – One Attribute X – One-way ANOVA (review) – GLM approach – Random effect vs. Fixed effect model
One Attribute X Example
• Project Goal: Reduce late deliveries (>36 hrs.) from suppliers
MINITAB®: SupplierLT.mtw
One attribute X: Example
Is there a practical difference among several suppliers?
% of Lead Time variance explained by variation in Supplier means
16
8
GLM approach to one attribute X • Model: yij = μ + ai + eij
where i represents factor level for A
17
Minitab Output
Expected lead time for “Blitz”: Y = 35.092 - 7.323(1) = 27.769
Expected time for “Hare”: Y = ?
Expected time for “Wild”: Y = 35.092 - 7.323(-1) + 5.023(-1) - 3.134(-1) + 7.716(-1) + 1.686(-1) = 31.125
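The pattern in these predictions is sum-to-zero (-1, 0, +1) coding: each level with a printed coefficient adds that coefficient to the constant, and the last level gets minus the sum of all the others. A minimal sketch using only the numbers in the equations above follows; the positions of the unnamed middle levels are not assumed, and the function name is illustrative.

```python
constant = 35.092
# Coefficients in the order they appear in the "Wild" equation.
# Only the first level (Blitz) and the last level (Wild) are identified there.
coefs = [-7.323, 5.023, -3.134, 7.716, 1.686]

def expected_lead_time(level_index):
    """Index 0..4 selects a level with its own coefficient; 5 is the last level."""
    if level_index < len(coefs):
        effect = coefs[level_index]
    else:
        effect = -sum(coefs)         # effects sum to zero across all six levels
    return constant + effect

blitz = expected_lead_time(0)
wild = expected_lead_time(5)
print(round(blitz, 3), round(wild, 3))
```

With the rounded coefficients the last level comes out at 31.124, which agrees with the 31.125 on the slide up to rounding of the printed coefficients.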
9
GLM: multiple comparisons
19
Multiple comparison
20
10
Capabilities of ANOVA vs GLM

Capability                                                        ANOVA   GLM
Can fit unbalanced data                                           no*     yes
Can specify factors as random and obtain expected mean squares    yes     yes
Fits covariates                                                   no      yes
Performs multiple comparisons                                     no*     yes

* Except for one-way ANOVA
GLM has some limits
Just like the one-way ANOVA:
• residuals should be distributed normally
• residuals should not have a pattern when plotted against the predicted Y
• residuals should not have a pattern when plotted in run order
Just like regression:
• one should check that factors aren’t highly correlated
• one should simplify the model.
What are Random Effects?
• Random X
– X is a random factor when the levels of X are randomly chosen from a population of possible levels.
– Inferences are made on the overall population of Xs, rather than on the specific levels chosen for the experiment.
– Random effect models focus on estimating variance components. How much variation in Y is due to X? There is less concern with estimating the mean for any particular level of X.
Example: Selecting a random sample of 3 operators and a random sample of 5 parts for a Gage R&R Study in MSA
What are Fixed Effects?
• Fixed X
– The specific levels used in the experiment will be controlled and replicated in a real manufacturing situation.
– There are only a few discrete levels of X which are of scientific interest, or there are only a few discrete levels of X which can actually be produced in the real world.
– We are specifically interested in estimating the mean value of Y for a given value of X.
Fixed vs. Random Quiz
1. MECC wishes to understand the impact of two different material suppliers upon weld penetration. Based on the specific performance of each supplier, MECC intends to establish a long-term contract with one or both suppliers.
– Supplier is a _____ effect for the response of weld penetration.
2. In a Gage R&R study, we select three operators from a pool of 30. We are not interested in the specific performance of the 3 operators in the experiment; we wish to understand the variability due to operator.
– Operator is a _____ effect.
Common Examples in Manufacturing • Fixed Effects: – Designs – Suppliers – Material types – Controllable process settings (e.g. laser power, position, etc.)
• Random Effects: – Lots – Operators – Subsampling from a finite population of levels – Noise variables (uncontrollable aspects of a process)
Random Effect vs. Fixed Effect
• Example: Fiber Strength Data (MINITAB®: Loom.mtw)
• Model: $y_{ij} = \mu + a_i + e_{ij}$
• Random Effect Model: $\mathrm{Var}(y) = \sigma_a^2 + \sigma^2$
• Objective of Random Effect Model: Estimate $\sigma_a^2$ and $\sigma^2$
One-way ANOVA for fiber strength data
[Figure: Minitab one-way ANOVA output, annotated with $n\sigma_a^2$]
GLM for Random Effect Model MINITAB®
Loom.mtw
29
Compare to the manual results
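The manual calculation behind this comparison is the usual method-of-moments estimate for a balanced one-way random-effects model: the within-level mean square estimates the error variance, and (MS between - MS within)/n estimates the between-level variance. The sketch below uses made-up readings, not the Loom.mtw data, and the function name is hypothetical.

```python
def variance_components(groups):
    """Balanced one-way random-effects ANOVA: returns (sigma2_a, sigma2_error)."""
    a = len(groups)                  # number of levels (e.g. looms)
    n = len(groups[0])               # observations per level (balanced design)
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (a - 1)
    ms_within = sum((y - m) ** 2
                    for g, m in zip(groups, means)
                    for y in g) / (a * (n - 1))
    # A negative moment estimate is truncated at zero
    sigma2_a = max((ms_between - ms_within) / n, 0.0)
    return sigma2_a, ms_within

# Hypothetical readings for three levels, two observations each
demo = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
print(variance_components(demo))
```

For the demo data the between mean square is 32 and the within mean square is 2, so the estimates are (32 - 2)/2 = 15 and 2.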
30
Topics to be covered GLM: Variable Y – Two Attribute Xs – Two-way ANOVA – GLM approach – Crossed vs. Nested design
32
16
Example: Two Attribute Xs
Problem Statement: Customer service call center staffing is often too high (waste) or too low (low customer satisfaction).
Project Goal: Improve call center forecast accuracy. An accurate forecast is within 20 calls of actual.
Path Y: Calls Received
Xs:
– Day (Monday to Friday)
– Shift: 1 (21:00-3:00), 2 (3:00-9:00), 3 (9:00-15:00), 4 (15:00-21:00)
MINITAB®: Call Center Attribute XR.mtw
Monday begins at 21:00 on Sunday, etc.
ANOVA approach
[Figure: Minitab two-way ANOVA dialog; callouts mark the Y, the Xs, and the default interaction between Xs]
ANOVA Output
• yijk = µ + ai + bj + abij + eijk
General Linear Model Approach
[Figure: Minitab General Linear Model dialog; callouts mark the Y and the Xs]
General Linear Model, cont.
p < 0.05 (Day and Shift are Key Xs)
p < 0.05 (Day*Shift interaction is significant)
GLM: Main Effect Plot
[Figure: Main Effects Plot (fitted means) for Calls Received; panels: Day (Mon to Fri) and Shift (1 to 4); y-axis: Mean of Calls Received]
• Each point is the mean number of calls received for that day or that shift
• # of calls received decreases by day of week
• # of calls received is lower for 1st shift
• Since interaction is significant, these plots do not tell the whole story!
GLM: Interaction Plot
[Figure: interaction plots for Calls Received; in one panel each line is a different day, in the other each line is a different shift]
Interaction = Lines NOT Parallel
• Shift 1 appears to have more calls on Monday than other days
• Since p < 0.05 for Day*Shift, this observed interaction is significant
• Effect of Shift depends on Day. Effect of Day depends on Shift.
Example: Statistical Impact of X on Y:
• Shift: 29278.4/39776.0 = 73.6% of the variation in calls received
• Day: 1473.8/39776.0 = 3.7% of the variation in calls received
• Day*Shift: 5412.4/39776.0 = 13.6% of the variation in calls received
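These percentages are each term's sum of squares divided by a common denominator. The sketch below reproduces them under the assumption that 39776.0 is the total sum of squares, so that whatever is left over belongs to error; that reading is an inference from the numbers, not something the slide states.

```python
ss_total = 39776.0                   # assumed to be the total sum of squares
ss = {"Shift": 29278.4, "Day": 1473.8, "Day*Shift": 5412.4}

# Percent of total variation attributed to each model term
share = {term: 100 * value / ss_total for term, value in ss.items()}

# Under the assumption above, the remainder is the error sum of squares
ss_error = ss_total - sum(ss.values())
share["Error"] = 100 * ss_error / ss_total

for term, pct in share.items():
    print(f"{term}: {pct:.1f}%")
```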
Exercise: All Xs Attributes, Y Variables
MINITAB®: Days Overdue.mtw
Project Goal: Improve On Time Delivery to Customer
Project Strategy: Path Y = Days Overdue
Xs: X1 = Product (1 or 2); X2 = Priority (1 to 4), 1 = Highest Priority, 4 = No Priority
Task: Perform ALL steps of Analyze using the data
Approach: Work alone or in small groups.
Time: 15 Minutes
Exercise Debrief
Solution:
• What are the key Xs?
• What is the relationship between the key Xs and Y?
• What is the impact of the key Xs on Y?
What was difficult?
Residuals Verify Assumptions
MINITAB®: Days Overdue.mtw
[Figure: Residual Plots for Missed Days (standardized residuals), four panels]
• Normal Probability Plot of the Residuals: verify Normality Assumption (want fit to line)
• Residuals Versus the Fitted Values: verify Equal Variance Assumption (want no patterns)
• Histogram of the Residuals
• Residuals Versus the Order of the Data: verify Independence Assumption (want no patterns)
Another two attribute Xs example: Gage R&R
MINITAB®: Micrometer.mtw
Design: Crossed design
Model: Random Effect model
Nesting • Factor B is nested in factor A if the levels of B have different meanings for each level of A. • Stated differently, factor B is nested in factor A if there is a completely different set of levels of B for every level of A. • Minitab notation: “B(A)” means B is nested within A.
Nesting Example • Example: An experiment is run with three suppliers, each of which produces three batches of material. – There clearly are three levels of supplier, but how many levels of batch are there? – Batch 1 from supplier 1 has nothing to do with batch 1 from supplier 2. Batch “level 1” has no consistent meaning across suppliers. So Batch is nested in supplier. – Instead of labeling the batch levels as 1-3, it would be appropriate to label them 1-9.
• You know that B is nested in A if it is reasonable to label each level of B differently, depending on the level of A.
Crossing • Factor B is crossed with Factor A if the levels of B have the same meaning for each level of A. • This is the standard factorial structure of a DOE • Example: An experiment is run with three suppliers, each of which utilizes two types of material—100% gold or 100% nickel. – Gold and Nickel have the same meaning and same interpretation, regardless of supplier. – Supplier is therefore crossed with material.
Example of Nested Design
• Levels of Batches nested within levels of supplier
– Is this a factorial design?
– Can we estimate Supplier × batch interaction?
yijk = µ + ai + bj(i) + ek(ij)
Nested Design - continued
• A company buys raw material in batches from 3 different suppliers. The purity of this material varies considerably, which causes problems in manufacturing the finished product. We wish to determine if the variability in purity is attributable to differences between the suppliers. Four batches of raw material are selected at random from each supplier, and 3 determinations are made on each batch.
MINITAB®: Purity.mtw
ANOVA for the Purity data yijk = µ + ai + b j(i) + ek(ij)
A = Fixed or Random ?,
Is Supplier a key X?
= 1.62
batch
B = Fixed or Random ?
=?
Is there differences among suppliers? 53
Incorrect GLM Analysis
Supplier and batch fixed effects
Two-way ANOVA: purity versus supplier, batch

Source        DF        SS        MS      F      P
supplier       2    15.056   7.52778   2.85  0.077
batch          3    25.639   8.54630   3.24  0.040
Interaction    6    44.278   7.37963   2.80  0.033
Error         24    63.333   2.63889
Total         35   148.306

S = 1.624   R-Sq = 57.30%   R-Sq(adj) = 37.72%
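To see why the crossed analysis misleads, the same sums of squares can be regrouped for the nested model. In a nested design the batch and interaction lines of the crossed table pool into a single batch-within-supplier term, and supplier is tested against that term rather than against error. The sketch below assumes batch is a random effect and works only from the table above; p-values are omitted because they would need an F distribution routine.

```python
# Sums of squares and degrees of freedom from the crossed (incorrect) table
ss_supplier, df_supplier = 15.056, 2
ss_batch, df_batch = 25.639, 3
ss_inter, df_inter = 44.278, 6
ss_error, df_error = 63.333, 24
n_per_batch = 3                          # determinations per batch

# Batch nested in supplier absorbs the "batch" and "Interaction" lines
ss_nested = ss_batch + ss_inter
df_nested = df_batch + df_inter

ms_supplier = ss_supplier / df_supplier
ms_nested = ss_nested / df_nested
ms_error = ss_error / df_error

f_supplier = ms_supplier / ms_nested     # supplier tested against batch(supplier)
f_batch = ms_nested / ms_error           # batch(supplier) tested against error
var_batch = (ms_nested - ms_error) / n_per_batch   # batch variance component
print(round(f_supplier, 2), round(f_batch, 2), round(var_batch, 2))
```

The F ratio for supplier drops from 2.85 in the crossed table to below 1 once it is compared with batch-to-batch variation.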
GLM Exercise: MINITAB®
(Purity.mtw)
Is supplier a key X?
Assume that suppliers were randomly chosen (i.e., random effect), and estimate supplier using GLM. 55
Summary: Different Types of Xs
1 X at a time: F/R
2 or more Xs at a time: F/R, C/N
F = Fixed, R = Random
C = Crossed, N = Nested
Specifying the Model Terms in Minitab

Example: Factor A, B crossed
  Statistical model: Yijk = μ + ai + bj + abij + eijk
  Terms in model: A, B, A*B

Example: Crossed and nested (B nested within A, both crossed with C)
  Statistical model: Yijkl = μ + ai + bj(i) + ck + acik + bcjk(i) + el(ijk)
  Terms in model: A, B(A), C, A*C, B*C
Exercise
MINITAB®: Time.MTW
Problem Statement: Assembly time is too long for a manufacturing process. Type of layout and type of fixture are suspect Xs for assembly lead time. Two (2) different layouts and three (3) different fixtures are to be tested. Two (2) groups of 4 operators each are randomly selected to test each layout with the 3 fixtures, two times. All factorial combinations of layout and fixture are completely randomized in the experiment.
Task: Are type of fixture and layout key Xs for assembly time?
Experimental Design:
– Layout L1: operators O1, O2, O3, O4
– Layout L2: operators O5, O6, O7, O8
– Fixtures: F1, F2, F3
Time: 20 minutes
Topics to be covered GLM: Variable Y – Mixture of Attribute and Variable Xs – GLM with Covariates – Strategic GLM
60
30
When Can I Treat an X as Variables?
• When the relationship between X and Y can be described with a line or curve
• Number of levels does not determine Variables X vs Attribute X
[Figure, Variables (Covariate) X: Coffee Taste vs Brew To Serve Time; actual data with a quadratic curve Y=F(X); X = Brew To Serve Time has 3 levels (1, 30, 60)]
[Figure, Attribute (Factor) X: Lead Time (Days) vs Supplier; actual data with no line or curve Y=F(X); X = Supplier has 10 levels (1 to 10)]
GLM for Mixture of Attribute and Variable Xs
• Specify variable Xs as covariates
Example: MINITAB®: Catapult Multiple X.mtw
Are there significant main effects? interactions? curvature?
Analyze Centering Xs - Main Effects
Covariates tells MINITAB which Xs are variables 63
Analyze Xs - Main Effects

Source     DF    Seq SS    Adj SS    Adj MS      F      P
Rub Band    3    3784.8    5991.3    1997.1  14.91  0.000
Shot        1       8.3      61.3      61.3   0.46  0.503
Operator    1      55.7      40.2      40.2   0.30  0.587
Ball        2     684.3     572.2     286.1   2.14  0.133
Time PB     1     336.8       1.4       1.4   0.01  0.919
PB Angle    1   10108.2   10108.2   10108.2  75.45  0.000
Error      37    4957.1    4957.1     134.0
Total      46   19935.2

• Rub Band, PB Angle are significant Main Effects
• Ball is close (include it for now)
• Remember: p-values will change when terms are added or deleted from model
Reduce Terms Edit Last Dialog
Tells MINITAB to give coefficients for Attribute as well as Variables Xs
65
Reduce Terms

Source     DF    Seq SS    Adj SS    Adj MS      F      P
Rub Band    3    3784.8    5988.0    1996.0  15.63  0.000
Ball        2     691.8     681.2     340.6   2.67  0.082
PB Angle    1   10348.8   10348.8   10348.8  81.01  0.000
Error      40    5109.8    5109.8     127.7
Total      46   19935.2

Ball p-value smaller with Shot, Operator, Time PB removed from model
Should we keep Ball?
What If We Treat PB Angle as Attribute?

Source     DF    Seq SS    Adj SS   Adj MS      F      P
Rub Band    3    3784.8    5212.1   1737.4  19.13  0.000
Ball        2     691.8     900.0    450.0   4.95  0.012
PB Angle    4   12097.9   12097.9   3024.5  33.30  0.000
Error      37    3360.7    3360.7     90.8
Total      46   19935.2

PB Angle as:   Variable   Attribute
DF                    1           4
Adj SS          10348.8     12097.9
F                 81.01       33.30
p                 0.000       0.000
Error DF             40          37
What If We Treat PB Angle as Attribute?

Term           Coef   SE Coef       T      P
Constant     99.896     1.531   65.24  0.000
Rub Band
  1         -14.661     2.725   -5.38  0.000
  2          19.518     2.942    6.63  0.000
  3           3.354     2.597    1.29  0.204
Ball
  Golf        7.570     2.664    2.84  0.007
  Wiffle     -5.102     2.031   -2.51  0.016
PB Angle
  130       -32.106     3.035  -10.58  0.000
  140        -4.533     2.920   -1.55  0.129
  150         7.497     3.129    2.40  0.022
  160         9.455     3.669    2.58  0.014

Model Prediction (Rub Band = 1, Ball = Wiffle, PB Angle = 150)
Distance = 99.896 - 14.661 - 5.102 + 7.497 = 87.63

Model Prediction (Rub Band = 4, Ball = Golf, PB Angle = 180)
Impossible: Can only get PB Angle predictions for 130, 140, 150, 160, 170
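The two predictions can be mimicked with a small lookup. The sketch below treats every X as an attribute, as in the coefficient table, and assumes the sum-to-zero coding implied by that table, so a level without a printed coefficient (Rub Band 4, PB Angle 170) takes minus the sum of the printed ones. The third Ball level is not named on the slide and is therefore left out; the function and dictionary names are illustrative.

```python
constant = 99.896
effects = {
    "Rub Band": {1: -14.661, 2: 19.518, 3: 3.354},
    "Ball": {"Golf": 7.570, "Wiffle": -5.102},
    "PB Angle": {130: -32.106, 140: -4.533, 150: 7.497, 160: 9.455},
}
# Levels named on the slide that have no printed coefficient
omitted_level = {"Rub Band": 4, "PB Angle": 170}

def predict(settings):
    """Predicted distance for a dict of {factor: level}."""
    distance = constant
    for factor, level in settings.items():
        table = effects[factor]
        if level in table:
            distance += table[level]
        elif omitted_level.get(factor) == level:
            distance -= sum(table.values())      # effects sum to zero
        else:
            raise ValueError(f"{factor} = {level} was not a level in the experiment")
    return distance

print(round(predict({"Rub Band": 1, "Ball": "Wiffle", "PB Angle": 150}), 2))
```

Asking for PB Angle = 180 raises an error, which is the point of the second prediction: an attribute X has no coefficient for a level that was never run, whereas a covariate would allow interpolation along the fitted line or curve.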
34
Interactions
If sample size is small try interactions one at a time
69
Interactions

Source           DF    Seq SS   Adj SS   Adj MS      F      P
Rub Band          3    3784.8   4423.8   1474.6  10.19  0.000
Ball              2     691.8    502.4    251.2   1.74  0.192
PB Angle          1   10348.8   8766.5   8766.5  60.55  0.000
Rub Band*Ball     6     187.5    187.5     31.2   0.22  0.969
Error            34    4922.3   4922.3    144.8
Total            46   19935.2

Rub Band*Ball Interaction not significant
Note:
• 6 DF (degrees of freedom) for Rub Band*Ball = 3 * 2
• 34 DF left for Error = 46 - 3 - 2 - 1 - 6
• If DF for Error decreases then p values increase
• If DF for Error reaches 0 then no p values are possible (MINITAB will complain!)
Conclusion: Be careful when adding interactions (DF for Error may reach 0)
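The degrees-of-freedom bookkeeping in the note can be written out as a short calculation. This sketch assumes 47 observations (implied by a Total DF of 46) and the level counts implied by the table (4 rubber bands, 3 balls); the variable names are illustrative.

```python
n_obs = 47                               # Total DF of 46 implies 47 observations
levels = {"Rub Band": 4, "Ball": 3}      # attribute Xs use (levels - 1) DF
covariates = ["PB Angle"]                # each covariate uses 1 DF
interactions = [("Rub Band", "Ball")]

df = {name: k - 1 for name, k in levels.items()}
df.update({name: 1 for name in covariates})
for a, b in interactions:
    df[f"{a}*{b}"] = df[a] * df[b]       # interaction DF is the product of its parts

df_error = (n_obs - 1) - sum(df.values())
print(df, df_error)
```

Adding further interactions to the `interactions` list shows how quickly the error DF shrinks when attribute Xs have several levels.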
35
Interactions
p-values for Interactions

            Rub Band   Ball
Ball           0.969      -
PB Angle       0.566  0.211

Conclusion: No significant interactions
Should we test interactions with Shot? Operator? Time PB?
Review: Linear vs Curvature
[Figure: Main Effects Plot (data means) for Taste; x-axis: Brew to Serve Time (1.0, 30.5, 60.0); y-axis: Mean of Taste; point types: Corner, Center; curves labeled Curvature Model and Linear Model]
Curvature only applies to variables Xs!
Quadratic Model: Y = aX² + bX + c
Curvature - Detecting with Residuals
Plot Residuals vs PB Angle to graphically check for curvature
[Figure: Residuals Versus PB Angle (response is Distance); x-axis: PB Angle (130 to 170); y-axis: Standardized Residual]
Looks like curvature. Now let’s prove it!
Curvature - Add X² Term to Model
Add the term (PB Angle)², entered as PB Angle*PB Angle

Source               DF    Seq SS   Adj SS   Adj MS      F      P
Rub Band              3    3784.8   6036.3   2012.1  20.93  0.000
Ball                  2     691.8   1060.9    530.5   5.52  0.008
PB Angle              1   10348.8   1685.1   1685.1  17.53  0.000
PB Angle*PB Angle     1    1360.4   1360.4   1360.4  14.15  0.001
Error                39    3749.3   3749.3     96.1
Total                46   19935.2

• (PB Angle)² is significant
• Note: Ball main effect is now significant
Final Model - Check Residuals
We now have all terms for our model. We need to check the residuals to understand how good the model is.
Final Model - Check Residuals
[Figure: Residual Plots for Distance (standardized residuals), four panels: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
Brush over to find which point is causing trouble!
What do we conclude?
Exercise
• Twelve steel brackets were randomly divided into three groups and sent to three vendors to be zinc plated. The chief concern in this process is whether or not there is any difference in zinc thickness among vendors. The following table lists the plating thickness (Y), as well as the thickness of the bracket (X), in hundred-thousandths of an inch.
MINITAB®: Zinc plating.mtw
Exercise: Questions
1) One-way ANOVA: X = vendor. Are there significant differences among vendors?
2) GLM: X1 = vendor, X2 = Bracket Thickness. How does this change the conclusion?
3) Bonus Question: If you were to do this testing again, what would you do differently? Use a graphical tool to support your rationale (Suggestion: try Interaction Plot under ANOVA)
Problems with Designs: Correlated Xs
81
Return to MECC Example
Pin Pulls.mtw
• Response Variable (Y): Pull Strength
• Predictor Variables (Xs):
– Hole diameter: 17.5, 18.5, or 19.5
– Fillet Style: one-sided or two-sided
– Solder size: small or large
• Exercise:
– Fit a GLM to create a model for pull strength
– Can Hole diameter be reasonably treated as a covariate? (Engineering theory suggests that it can.)
– Determine if variables are fixed vs. random, crossed vs. nested
– Which X’s are statistically significant?
Summary And Recap • Understand how GLM is a generalization of ANOVA and regression • Understand three primary concepts within GLM models – Fixed vs. Random effects – Nesting vs. Crossing – Covariate (Continuous) vs. Factor (Attribute)
• Fit GLM in Minitab
Logistic Regression I still feel like I’m regressing LeRoy Mattson
Objectives • Understand how logistic regression creates a predictive model for an attribute Y • Fit logistic regression models in Minitab
Logistic Regression
Logistic Regression – Attribute Y, One X Logistic Regression – Attribute Y, Multiple Xs
Attribute Y Data Types
• Individual unit categorized into a classification
• Finite number of possible values
• Cannot be subdivided meaningfully
4 Attribute data types:
• Binary (pass/fail, good/bad)
• Nominal (complaint codes, problem type)
• Ordinal (low/medium/high, mild/moderate/severe)
• Discrete (# errors)
Is Smoking (X) a key X for Lung cancer (Y)?
Y: Lung Cancer; Yes, No
X: Smoking; Yes, No

X\Y           Lung Cancer   No Lung Cancer   Total
Smoker                  2                3       5
Non-smoker              1                8       9

Analysis tools: Relative Risk or Odds Ratio
The Relative Risk of lung cancer for smoker vs non-smoker = (2/5)/(1/9) = 3.6
Concept: Odds Ratio (OR) as a Measure of X Impact for Attribute Y
OR = odds of Y outcome in one group relative to another group
$$\mathrm{OR} = \frac{\text{odds of cancer for smokers}}{\text{odds of cancer for non-smokers}} = \frac{2/3}{1/8} = \frac{0.67}{0.125} = 5.33$$
Interpretation of Odds Ratio: 5.33
• Odds of cancer for smokers are 5.33 × the odds for non-smokers
• Odds of getting cancer are increased 433% with smoking (an odds ratio describes odds, not probability)
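Relative risk and odds ratio are easy to confuse, so it can help to compute both from the same 2x2 table. The sketch below uses the smoking counts from the example; the helper names are illustrative.

```python
# 2x2 table from the smoking example: (cases, non-cases)
smoker = (2, 3)
non_smoker = (1, 8)

def risk(cases, non_cases):
    # Probability of the outcome within a group
    return cases / (cases + non_cases)

def odds(cases, non_cases):
    # Odds of the outcome within a group
    return cases / non_cases

relative_risk = risk(*smoker) / risk(*non_smoker)
odds_ratio = odds(*smoker) / odds(*non_smoker)
print(round(relative_risk, 2), round(odds_ratio, 2))
```

The two measures differ here (3.6 versus 5.33) because the outcome is not rare; they move closer together as the outcome becomes less common.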
Statistical Tools for Analyzing key Xs

                                 X: Variables (continuous)   X: Attribute (discrete)
Y: Variables (continuous data)   Regression                  t-test (1 X, 2 levels)
                                 Multiple Regression         One-way ANOVA
                                 GLM                         GLM
Y: Attribute (discrete data)     Logistic Regression         Chi Square
                                                             Logistic Regression
Example – Binary Logistic Regression Attribute X
Problem Statement: Is smoking associated with disease in previous example? Y? X?
MINITAB®: Smoking.MTW
Task: In this module… 1) What tool(s) for Hypothesis Test? 2) What tool(s) for Graphical Analysis?
Approach: Work individually.
Y = Cancer or Cancer-Free
X = Exposure (smoking/nonsmoking)
Logistic Regression Analysis
9
Logistic Regression Analysis – cont.
Wald Test to Verify Key X: If p-value < 0.05, X is Key (Smoking is not Key X)
OR for Attribute X Impact: OR of disease for exposed relative to unexposed = 5.33 (433% increase in odds of disease for smoking relative to nonsmoking)
Exercise – Binary Logistic Regression Attribute X
Problem Statement: A new data set on smoking has been collected. Analyze this data set and determine if smoking has an effect on cancer.
MINITAB®: Smoking2.MTW
Example – Binary Logistic Regression Variable X
Problem Statement: Toy company is interested in whether a toy missile will hit flying targets of varying speeds. Y? X?
MINITAB®: Speed.MTW
Task: In this module… 1) What tool(s) for Hypothesis Test? 2) What tool(s) for Graphical Analysis?
Approach: Work individually.
Y = Hit or Miss? (1/0)
X = Target speed
Incorrect Analysis: Variables Y
[Figure: Fitted Line Plot, hit or miss vs target speed (cms/sec)]
hit or miss = 1.562 - 0.003005 target speed (cms/sec)
S = 0.397278   R-Sq = 41.8%   R-Sq(adj) = 39.3%
• Fitted Line extends beyond 0 and 1
• Heteroscedastic (Unequal) Variances
[Figure: Residual Plots for hit or miss (standardized residuals), four panels: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
Correct Analysis: Use Binary Logistic Regression
What if we analyze proportion of hits?
[Figure: two panels against X (200 to 500). Left: Proportion p vs X, an S-shaped curve with b1 < 0. Right: Logit(p) vs X, the straight line logit(p) = b0 + b1X]
• p(x) = proportion of Y-attributes at each X value
• Logit transformation straightens S-shape to straight line
• Logit(p) = $\log_e[p/(1-p)]$
• Logistic f(x): $\log_e[p(x)/(1-p(x))] = b_0 + b_1 X$
Origins: Verhulst (mathematician) named the logistic function (1838-1847: 3 papers). Pearl and Reed (1920, Johns Hopkins, Biometry and Vital Statistics) rediscovered the logistic function to model population growth in the US.
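The logit and its inverse are short enough to write out directly. This is a minimal sketch of the two transformations named above; the function names are illustrative.

```python
import math

def logit(p):
    # Log-odds of a probability strictly between 0 and 1
    return math.log(p / (1 - p))

def inverse_logit(x):
    # Maps any real number back to a probability between 0 and 1
    return 1 / (1 + math.exp(-x))

# A straight line on the logit scale becomes an S-shaped curve in p
b0, b1 = 2.0, -0.5                  # hypothetical coefficients for illustration
curve = [inverse_logit(b0 + b1 * x) for x in range(0, 9)]
print([round(p, 2) for p in curve])
```

Because the inverse logit always returns a value between 0 and 1, the fitted curve cannot run past those limits the way a fitted straight line does.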
7
Binary Logistic Regression
Does target speed affect hit or miss?
[Figure: Minitab Binary Logistic Regression dialog; callouts: Raw data (0,1); Declare Attribute Xs; Fitted probabilities stored as EPRO1]
Default: Event is “1”
Identifying Key Xs

Link Function: Logit

Response Information
Variable      Value   Count
hit or miss   1          13   (Event)
              0          12
              Total      25

Logistic Regression Table
                                                               Odds     95% CI
Predictor                      Coef     SE Coef      Z      P  Ratio  Lower  Upper
Constant                    5.56028     2.04130   2.72  0.006
target speed (cms/sec)   -0.0156619   0.0055920  -2.80  0.005   0.98   0.97   1.00

Log-Likelihood = -11.411
Test that all slopes are zero: G = 11.796, DF = 1, P-Value = 0.001

Wald Test to Verify Key X: If p-value < 0.05, X is Key (Target speed is Key X)
Compare with GLM: Wald test in Logistic is similar to t-test in GLM
Measuring X Impact – Odds Ratio for X (OR)
OR for Variables X Impact:
OR for a c unit increase in X = e^(c*b1) = (Odds Ratio for 1 unit X increase)^c
(Need to determine meaningful c)

Logistic Regression Table
                                                               Odds     95% CI
Predictor                      Coef     SE Coef      Z      P  Ratio  Lower  Upper
Constant                    5.56028     2.04130   2.72  0.006
target speed (cms/sec)   -0.0156619   0.0055920  -2.80  0.005   0.98   0.97   1.00

Log-Likelihood = -11.411
Test that all slopes are zero: G = 11.796, DF = 1, P-Value = 0.001

For a 50 unit increase in target speed, the odds of hitting the target are multiplied by e^(50 × -0.0156619) = 0.46 (i.e., a 54% reduction in odds). Use the unrounded odds ratio (0.9845) for this calculation; the rounded table value 0.98 would give 0.98^50 = 0.36.
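The odds ratio for a c-unit change follows directly from the slope in the logistic regression table. The sketch below uses the coefficient for target speed from that table; the function name is illustrative.

```python
import math

b1 = -0.0156619                     # slope for target speed (cms/sec) from the table

def odds_ratio(c):
    # Multiplicative change in the odds of a hit for a c-unit increase in speed
    return math.exp(c * b1)

or_1 = odds_ratio(1)
or_50 = odds_ratio(50)
or_50_from_rounded = 0.98 ** 50     # what you get if the printed 0.98 is reused
print(round(or_1, 4), round(or_50, 2), round(or_50_from_rounded, 2))
```

Raising the rounded table value to the 50th power compounds the rounding error, so it is safer to work from the coefficient itself.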
Graphical Analysis – Plot logistic curve
Check against raw data
[Figure: Scatterplot of hit or miss, EPRO1 vs target speed (cms/sec); x-axis: target speed from 200 to 500; y-axis: Y data from 0.0 to 1.0; the EPRO1 points trace the Fitted Logistic Curve]
At target speed = 350, there is about a 50% chance of hitting the target
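The fitted curve can be evaluated directly from the two coefficients in the logistic regression table. This sketch computes the predicted probability of a hit at a given speed and the speed at which that probability is exactly 50%; the function name is illustrative.

```python
import math

b0, b1 = 5.56028, -0.0156619        # constant and slope from the Minitab table

def p_hit(speed):
    # Inverse logit of the linear predictor b0 + b1 * speed
    return 1 / (1 + math.exp(-(b0 + b1 * speed)))

speed_at_half = -b0 / b1            # logit(0.5) = 0, so b0 + b1 * speed = 0
print(round(p_hit(350), 2), round(speed_at_half, 1))
```

The model puts the 50% point slightly above 350 cms/sec, which is consistent with reading roughly 50% off the plot at 350.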
Exercise: Binary Logistic Regression
Problem Statement: Chemotherapy-induced remission rate takes too long to measure and is inaccurate, causing delays and errors in cancer research.
Project Strategy: Determine if labeling index is a variables path Y for remission rate. Labeling index measures the proliferative activity of cells after a patient receives an injection of thymidine as part of chemotherapy. It represents the percentage of cells that are “labeled” (Lee, 1974).
MINITAB®: Cancer Remission.MTW (from Lee*)
Task: Verify if labeling index is a suitable Path Y for remission rate.
Approach: Work individually or in pairs
Time: 10 minutes
* Lee (1974): A computer program for linear logistic regression analysis. Computer Prog. Biomed. 4: 80-92.
Exercise Debrief
Solution:
1. Is labeling index a Key X?
2. What hypothesis test did you use to verify the key X?
3. Compare the results from fitted line plot.
4. Is impact of labeling index large enough to use as a Path Y for remission rate?
What did you learn?
Logistic Regression
Logistic Regression – Attribute Y, One X
Logistic Regression – Attribute Y, Multiple Xs
Example: Multiple Binary Logistic Regression
A cancer study showed the number of cases of esophageal cancer, classified by age group and alcohol consumption (0 = none, 1 = some).
Y? Xs? Data type?
Alcohol is Attribute X
MINITAB®: EsophagealCancer.MTW
Task: Verify if age group and alcohol consumption are key Xs for incidence of esophageal disease.
Study Effect of alcohol consumption/age on cancer cases
Fitted Line Plot: cancer % = 14.40 + 0.1286 age
S = 8.45436, R-Sq = 9.2%, R-Sq(adj) = 0.0%
[Figure: fitted line plot of cancer % vs age]
Are Xs correlated? (potential confounding of Xs)
[Figure: Scatterplot of alc% vs age]
Correlated Xs (corr = 51%)

Rows: Alcohol   Columns: Age Group
(each cell: count, then % of column)

          25      35      45      55      65      75     All
0          6       3      30       9      27      18      93
       18.18   14.29   83.33   30.00   81.82   66.67   51.67
1         27      18       6      21       6       9      87
       81.82   85.71   16.67   70.00   18.18   33.33   48.33
All       33      21      36      30      33      27     180
      100.00  100.00  100.00  100.00  100.00  100.00  100.00

% alcohol use depends on age group.
Example – Hypothesis Test
Impact = Odds Ratio (OR for Attribute X)

Logistic Regression Table

                                                       Odds     95% CI
Predictor    Coef        SE Coef         Z       P    Ratio   Lower   Upper
Constant     -2.72159    0.753215    -3.61   0.000
Alcohol
 1           0.733554    0.406715     1.80   0.071     2.08    0.94    4.62
Age Group    0.0187340   0.0119209    1.57   0.116     1.02    1.00    1.04

Log-Likelihood = -87.915
Test that all slopes are zero: G = 4.314, DF = 2, P-Value = 0.116

Alcohol is Key X: the odds of getting cancer increase by 108% with alcohol use (odds ratio 2.08).
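The Odds Ratio and 95% CI columns can be recovered from Coef and SE Coef. This sketch assumes the interval is the usual Wald interval, e^(coef ± z*SE) with z ≈ 1.96. That assumption reproduces the tabled values, but it is not confirmed by the slides. `odds_ratio_ci` is a hypothetical helper name.

```python
import math

Z_95 = 1.959964  # two-sided 95% standard normal quantile


def odds_ratio_ci(coef: float, se: float, z: float = Z_95):
    """Return (odds ratio, lower limit, upper limit) from a logistic coefficient."""
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


# Coef and SE Coef copied from the regression table.
predictors = [
    ("Alcohol 1", 0.733554, 0.406715),
    ("Age Group", 0.0187340, 0.0119209),
]

for name, coef, se in predictors:
    ratio, lower, upper = odds_ratio_ci(coef, se)
    print(f"{name}: OR = {ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio of 2.08 for Alcohol corresponds to the 108% increase in odds, since (2.08 - 1) × 100 = 108.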
Graphical Analysis – Raw data and Logistic Regression Estimates
[Figure: Scatterplot of EPRO1 (fitted logistic model) vs Age Group, grouped by Alcohol (0, 1)]
Exercise: Logistic Regression
Problem Statement: A sample of ingots is treated with four levels of heat time and five levels of soak time. The response is the number of ingots ready to be rolled (out of those tested) for each combination of times.
Project Goal: Maximize the ingots ready to be rolled.
MINITAB®: Ingots.MTW
Task: Verify if heat time and soak time are Key Xs.
Approach: Work individually or in pairs
Time: 10 minutes
Summary Quiz
True or False
___________ Use binary logistic regression when Y is Variables
___________ Odds ratio is odds of Y outcome in one group relative to another group
___________ Use GLM analysis when Y is attribute at 2 levels
Statistical Resources
Avoiding wheel re-invention
LeRoy Mattson

Objectives
• Ensure you are aware of statistical resources both internal and external to Medtronic:
  – Medtronic Statistical Resources Web Site
  – External Web Sites
• This chapter can serve as a reference document after the class is complete.
2 | MDT Confidential
Medtronic Statistical Resource Web Site
http://mitintra.corp.medtronic.com/corporate-statistics/
Software Validation Plans & Reports
For links to Validation Plans & Reports: Click on Search button on Web Site
• For Medstat Plans/Reports: Enter “Medstat validation”
• For Minitab Plans/Reports: Enter “Minitab validation”
• For Crystal Ball Validation Plans/Reports: Enter “Crystal Ball validation”
Note: These links are to pdf documents stored in Documentum.
About Corporate Stats
About Corporate Stats Cont.
Get Trained
Recap from Quality Trainer
• If you complete all of the Quality Trainer, Minitab will send you a Certificate. It takes 20-40 hours to complete all of the QT.
Get Trained Cont.
Tools/Resources: Software
Tools/Resources: Minitab16 Validation
Tools/Resources: Work Aids
Tools/Resources: MHOS
Medtronic Handbook of Statistics - Rev G: in pdf format only
Tools/Resources: Business SOPs
Business Unit procedures for:
• Test Method Validation (MSA included)
• Normality Testing
• Lot Acceptance (or sampling plans for incoming)
• SPC
Tools/Resources: Other Software
Miscellaneous
Tools/Resources: JMP Software
• Contact Kevin Gaffney at MECC if you are interested in obtaining JMP.
• The software has been officially validated and may be used within the quality system.
• JMP tends to be more interactive than Minitab and is more powerful for certain applications (e.g. advanced DOE).
• JMP is point-and-click like Minitab, but it is more “object-oriented” instead of “menu-oriented.”
Get Connected
Get Connected
Industrial Statistics Questions?
1) Contact your division’s Industrial Statistics Council member
2) Otherwise, contact Medtronic Statistical Resources
External Web Sites
Statistical Standards:
ISO Standards for Statistical Methods: can be purchased as part of a CD-ROM collection available at http://www.iso.org/iso/pressrelease.htm?refid=Ref1134
ASTM Standards on Precision and Bias, 6th edition: http://www.astm.org/BOOKSTORE/COMPS/BIAS08.htm
ASTM SPC Standard: http://www.astm.org/Standards/E2587.htm
External Web Sites
Statistical Standards:
ASQ has ANSI/ASQ standards: http://asq.org/quality-press/display-item/index.html?item=T004
GHTF has standards (link to GHTF process validation below): http://www.ghtf.org/sg3/sg3-final.html
AIAG has Guidance for MSA & SPC: Publications Catalog - Automotive Industry Action Group
Large list of Acceptance Sampling Standards: http://variation.com/techlib/standard.html
Has list of acceptance sampling stats standards: MIL-STD & ANSI & ISO. Bulk sampling & reliability are listed.
External Web Sites
Statistical Committees:
ISO Statistics Technical Committee: TC 69 - with six subcommittees
http://www.iso.org/iso/home/standards_development/list_of_iso_technical_committees/iso_technical_committee.htm?commid=49742
The Six ISO TC69 Subcommittees
ASTM Technical Committee E11 (Statistics): http://www.astm.org/COMMIT/COMMITTEE/E11.htm
USP Expert Statistics Committee: http://www.usp.org/council-experts-expert-committees-overview/expertcommittees/statistics
External Web Sites
Handbooks:
NIST E-Statistics Handbook (has hyperlinks): http://www.itl.nist.gov/div898/handbook/