Advanced Statistics Manual PDF

ADVANCED STATISTICAL METHODS FOR ENGINEERS

Chapter Zero Welcome to Advanced Statistical Methods for Engineers!

Ground rules – please…
• Use name tents
• Cell phones:
  – Turn off or use vibrate
  – Take phone calls outside
• Keep side conversations to a minimum
• Be prompt in returning from breaks
• Don’t do other work during class
• Let instructor know if you need to leave for more than 30 minutes
• Listen with an open and active mind…
• If you have a question at any time, ask!
– Other Ground Rules wanted by students?
– Class agree to these Ground Rules?


Agenda (Day 1 through Day 4; 8:00 to 5:00)

Ch 0: Welcome
Ch 3: Distribution Analysis
Ch 5: Regression and GLM
Ch 6: Logistic Regression
Ch 1: ANOVA and Equivalence Testing
Ch 7: Statistical Resources
End of Day Review
Online Evaluations
Lunch on your own
Ch 2: Measurement Systems Analysis
Ch 4: Process Capability and Tolerance Intervals
Ch 5: Regression and GLM continued
End of Day Review
Breaks as Needed

Logistics
• Starting Time: 8:00
• Ending Time: Not later than 5:00
• Lunch 12:00-1:00
• Breaks every 90-120 minutes
• Power Outlets
• Rest Room Location
• Food and drink locations (snacks, cafeteria, etc)

You Need ...
– Laptop with MINITAB and a working wireless Internet Connection
– Writing instruments
– Access to data files

Icebreaker (5 Minutes) In my journey through the world of statistics…

One thing that has worked well for me is …

One thing that has been a challenge for me is … (Extra Credit)

My favorite statistician, living or dead, is . . .

My favorite statistics joke is …


Expectations
– Tools, tools, tools…
  • Course may overlap with material from DRM or Lean Sigma
  • Tools may be familiar, but the intent is to present the tools with a focus on statistical thinking and decision-making.
  • Topics may be explored in greater mathematical depth than is offered in other curricula.
– Benefits
  • A deep mathematical dive can actually help you better see the surface.
  • Awareness of mathematical assumptions is a critical first step for growing in your statistical knowledge, but advanced practitioners need to know:
    – Which assumptions are most critical?
    – When is it appropriate to break the rules?
    – What are the consequences of breaking the rules?
  • Statistical sophistication allows for flexibility and creativity in problem solving.

Expectations
– Experience Chart
  • Mark an X in the column that best describes your experience with each topic

  Columns: None | A Little | Comfortable | Proficient | I could teach it

  Topics:
    Equivalence Testing
    Tolerance Intervals
    ANOVA Signal Interpretation
    Measurement Systems Analysis
    Distribution Analysis
    Process Capability
    General Linear Models

– Your Expectations
  • Create a list at your table
  • Each table will report
  • Spokesperson: skip items already mentioned
– Time: 10 Minutes

Your Feedback is Critical
• September 17-20 represents the first wave of Advanced SME at MDT
• Given that many of you already are leaders in the statistical or DRM worlds, your suggestions for course improvements are extremely important!
• At the end of each day, we will engage in a brief feedback session.
• At the end of the week, there will be an online survey for you to formally evaluate the course.
• If you wish to provide more detailed feedback, please send an email to the instructor team: Leroy Mattson, Karen Hulting, Jeremy Strief, Tom Keenan, Grant Short, Dayna Cruz

9 | MDT Confidential

What questions do you have?


Chapter 1: ANOVA and Equivalence Testing

Topics
• Quality Trainer Review
• ANOVA
  – Assumptions
  – Using Minitab Assistant vs Stat Menu
  – Calculation Deep Dive
  – Sample Size
  – ANOVA Signals
• Equivalence Testing

Quality Trainer Review


Comparing Grouped Data: Variables Data Response



ANOVA: ASSUMPTIONS


One-way ANOVA: Testing for the significance of one factor
• The null hypothesis:
  – H0: μ1 = μ2 = … = μk
  – Meaning that the population (response) means are equal at each of the k levels of this factor, or the factor is NOT significant.
• The alternative hypothesis:
  – HA: at least two population means are unequal
  – Meaning that the factor IS significant
• Perform the One-way ANOVA and reject the null hypothesis if the p-value is < alpha
  – Usually alpha = 0.05 (or 0.10 or 0.01)
  – A way to remember: “If p is low – the null must go”.

ANOVA: General Process Steps
• Select a model
• Plan sample size using relevant data or guesses
• (Optional) Simulate the data and try the analysis
• Collect real data
• Fit the model (perform ANOVA and get p value)
• Examine the residuals
• Transform the response or update the model, if necessary
• State conclusion

Typical Assumptions for ANOVA Factors • Factors (or “Inputs”) – Each factor can be set to two or more distinct levels – Factor levels can be measured adequately – Factor levels are “fixed” rather than “random” – For multiple factors, all combinations of all levels are represented (levels are “completely crossed”)



Typical Assumptions for ANOVA Responses • Response data is “complete”, not censored • Some software requires “balanced” data – same sample size for each level of the input factor • Assumptions on Residuals – Residual = Response – Fitted Value – Normally distributed – Equal variance (assumption relaxed in Minitab Assistant) – Independent (e.g. no time trend)


ANOVA CALCULATIONS DEEP DIVE: STAT MENU & MINITAB ASSISTANT



ANOVA Calculations • See www.khanacademy.org – ANOVA 1 – Calculating SST (7:39) – ANOVA 2 – Calculating SSW and SSB (13:20) – ANOVA 3 – Hypothesis Test and F Statistic (10:14)
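The three calculations named above (SST, SSW and SSB, then the F statistic) can be sketched in a few lines of plain Python. This is a minimal illustration, not the Minitab implementation: the three-group dataset is made up for the example (it is not taken from Ch1DataFile.mtw), and the critical value is the tabulated F(0.05; 2, 6) = 5.14 rather than a computed p-value.

```python
# Illustrative data only (an assumption, not from the course data files).
groups = {
    "A": [3.0, 2.0, 1.0],
    "B": [5.0, 3.0, 4.0],
    "C": [5.0, 6.0, 7.0],
}


def one_way_anova(groups):
    """Return the sums of squares and F statistic for a one-way ANOVA."""
    all_values = [y for values in groups.values() for y in values]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    # SST: variation of every observation around the grand mean.
    sst = sum((y - grand_mean) ** 2 for y in all_values)

    ssw = 0.0  # SSW: variation of observations around their own group mean
    ssb = 0.0  # SSB: variation of group means around the grand mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ssw += sum((y - group_mean) ** 2 for y in values)
        ssb += len(values) * (group_mean - grand_mean) ** 2

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ssb / df_between) / (ssw / df_within)
    return {"SST": sst, "SSW": ssw, "SSB": ssb,
            "df_between": df_between, "df_within": df_within, "F": f_stat}


result = one_way_anova(groups)

# Critical value F(0.05; 2, 6) = 5.14, taken from a standard F table.
F_CRITICAL = 5.14
reject_null = result["F"] > F_CRITICAL
```

Note that SST = SSW + SSB, which is a useful check on any hand calculation.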


Minitab Analysis of Khan Dataset
Can arrange either Stacked or Unstacked

Consider a PQ Dataset • Three runs of n=10 units produced and tensile tested • See Ch1DataFile.mtw • Columns TipTensile1, TipTensile2, TipTensile3


Minitab Options • Could use – Stat -> ANOVA – -> One way – -> One way (Unstacked) – -> General Linear Model – Stat -> Regression -> General Regression – Minitab Assistant

• Data arrangement – Stacked (one column for X, one column for Y) – Unstacked (Y values in columns for each X)
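The difference between the two data arrangements can be shown with a small sketch. The column names and values below are hypothetical, and the helper functions `stack` and `unstack` are illustrative only; they are not Minitab commands.

```python
# Unstacked: one list of Y values per level of X (values are made up).
unstacked = {
    "Run1": [10.1, 9.8, 10.3],
    "Run2": [10.6, 10.4, 10.9],
    "Run3": [9.9, 10.0, 10.2],
}


def stack(unstacked):
    """Convert unstacked columns into (x_label, y_value) rows."""
    return [(label, y) for label, ys in unstacked.items() for y in ys]


def unstack(stacked):
    """Convert (x_label, y_value) rows back into one list per label."""
    columns = {}
    for label, y in stacked:
        columns.setdefault(label, []).append(y)
    return columns


# Stacked: one column for X (the label) and one column for Y (the value).
stacked = stack(unstacked)
```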



ANOVA using Minitab Statistics Menu


Stat Menu Outputs

S, R² and adjusted R² are measures of how well the model fits the data.

Judging model fit
• S is measured in the units of the response variable and represents the standard distance data values fall from the fitted values
  – For a given study, the better the model predicts the response, the lower S is
• R² (R-Sq) describes the amount of variation in the observed response values that is explained by the predictor(s)
  – R² always increases with additional predictors.
  – R² is most useful when comparing models of the same size
• Adjusted R² is a modified R² that has been adjusted for the number of terms in the model
  – R² can be artificially high with unnecessary terms, while adjusted R² may get smaller when terms are added to the model
  – Use adjusted R² to compare models with different numbers of predictors
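As a rough sketch of how the three fit measures relate to the sums of squares, the function below uses the usual textbook definitions: S is the square root of the error mean square, R² = 1 − SSE/SST, and adjusted R² replaces each sum of squares with its mean square. The function name `fit_summary` and the small dataset are assumptions made for illustration.

```python
import math


def fit_summary(observed, fitted, n_model_terms):
    """Return S, R-squared and adjusted R-squared.

    n_model_terms counts the estimated parameters, including the constant
    (for a one-way ANOVA this equals the number of groups).
    """
    n = len(observed)
    mean_y = sum(observed) / n
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    sst = sum((y - mean_y) ** 2 for y in observed)
    df_error = n - n_model_terms
    df_total = n - 1
    s = math.sqrt(sse / df_error)                        # units of the response
    r_sq = 1 - sse / sst                                 # share of variation explained
    adj_r_sq = 1 - (sse / df_error) / (sst / df_total)   # penalized for extra terms
    return s, r_sq, adj_r_sq


# Made-up data: three groups of three, fitted values are the group means.
observed = [3, 2, 1, 5, 3, 4, 5, 6, 7]
fitted = [2, 2, 2, 4, 4, 4, 6, 6, 6]
s, r_sq, adj_r_sq = fit_summary(observed, fitted, n_model_terms=3)
```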


Comparisons Output



ANOVA – Examining Residuals
1) Test for Normality: the Normal Probability Plot is a straight line

2) Test for Equal Variances: the Residuals vs. Fitted Values plot is evenly distributed around the 0 line

Using the Stacked arrangement, there would also be a 4th Residual plot – Time Order. This is a Test for Independence – looking for a pattern over time.
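The residual checks above are graphical in Minitab. As a hedged numerical companion, the sketch below computes the residuals (response minus group mean) and the correlation between the sorted residuals and their normal scores; a value near 1 corresponds to a straight normal probability plot. The Blom plotting positions and the made-up data are assumptions, and this is not the normality test that Minitab runs.

```python
import math
from statistics import NormalDist


def residuals_by_group(groups):
    """Residual = response - fitted value (the group mean in one-way ANOVA)."""
    res = {}
    for label, values in groups.items():
        fitted = sum(values) / len(values)
        res[label] = [y - fitted for y in values]
    return res


def normal_plot_correlation(residuals):
    """Correlation between sorted residuals and their normal scores."""
    ordered = sorted(residuals)
    n = len(ordered)
    # Blom plotting positions (one common choice, assumed here).
    scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
              for i in range(1, n + 1)]
    mean_r = sum(ordered) / n
    mean_s = sum(scores) / n
    sxy = sum((r - mean_r) * (s - mean_s) for r, s in zip(ordered, scores))
    sxx = sum((r - mean_r) ** 2 for r in ordered)
    syy = sum((s - mean_s) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * syy)


# Made-up data for illustration.
groups = {"A": [3.0, 2.0, 1.0], "B": [5.0, 3.0, 4.0], "C": [5.0, 6.0, 7.0]}
res = residuals_by_group(groups)
all_res = [r for values in res.values() for r in values]
r_normal = normal_plot_correlation(all_res)

# Crude equal-variance check: compare the residual range in each group.
spread = {label: max(v) - min(v) for label, v in res.items()}
```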

Residuals are strongly non-normal . . .
Possible Causes:
• Failure of Equal Variance Assumption
• Outliers
• Missing Important Factors in the Model
• Data is from Non-Normal Population
What to do?
• Check for Outliers
• Check if Equal Variance is satisfied
• Perform Normality Test
• If data is from Non-Normal Population consider using Non-Parametric Tests or Transform the Response variable

If Residuals differ Group to Group
Possible Causes:
• Non-Constant Variance
• Outliers
• Missing Important Factors in the Model
What to do?
• Test the equal variance assumption using Stat > ANOVA > Test for Equal Variances
• If the test indicates unequal variances, then consider transforming the response variable
• Verify if the outlier is a data entry error
• Add the factor into the model

If there is a time pattern in the data . . .
What to do?
• Prevent by Randomizing
• A time effect may be present
• Consider time series procedure

Common Transformations

Transformation      Comments
sqrt(y)             Appropriate for Poisson Distributed Data
Log(y)              If the Response is exponentially increasing then this transformation is appropriate
1/y                 Appropriate when responses are close to zero
arcsin(sqrt(y))     Called the Arcsine Square Root function. Appropriate when Response is a proportion between zero and one.

Another useful tool is Box-Cox Transformation

Minitab Box-Cox Procedure:
$Y' = Y^{\lambda}$, when $\lambda \neq 0$
$Y' = \log_e(Y)$, when $\lambda = 0$
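A minimal sketch of the transformation itself, assuming strictly positive data and a lambda that has already been chosen (for example, the rounded value read from the Box-Cox plot). The function name `box_cox` is illustrative; estimating the optimal lambda is not shown.

```python
import math


def box_cox(values, lam):
    """Apply Y' = Y**lam when lam != 0, and Y' = ln(Y) when lam == 0."""
    if any(y <= 0 for y in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(y) for y in values]
    return [y ** lam for y in values]


data = [1.0, 4.0, 9.0]          # made-up values
sqrt_like = box_cox(data, 0.5)  # lambda = 0.5 is the square root
log_like = box_cox(data, 0)     # lambda = 0 is the natural log
```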

Minitab Screenshots

Box-Cox Transformation in Minitab Minitab > Stat > Control Charts > Box-Cox Transformation

[Figure: Box-Cox Plot of Data 1. StDev vs. Lambda, with Lower CL, Upper CL and Limit marked. Lambda (using 95.0% confidence): Estimate 0.03, Lower CL -0.30, Upper CL 0.38, Rounded Value 0.00]

ANOVA using Minitab Assistant

http://www.minitab.com/support/documentation/Answers/Assistant%20White%20Papers/OneWayANOVA_MtbAsstMenuWhitePaper.pdf

Report Card



Diagnostic Report


Power Report



Summary Report


ANOVA - Exercise • Use Ch1DataFile.mtw • Test for differences between the group means using both Stat menu ANOVA and Minitab Assistant ANOVA . . . for these 3-lot PQ studies: – For TubeTensile1, TubeTensile2, TubeTensile3 – For Diameter1, Diameter2, Diameter3

• What are your conclusions?



ANOVA – Alternate Exercise
Analyze this data two ways: 1) Assistant and 2) Stat > ANOVA
Note: Stat > ANOVA assumes equal variances (and so may need transformations), but Minitab Assistant ANOVA does not assume equal variances.
An article in the IEEE Transactions on Components, Hybrids, and Manufacturing Technology (Vol. 15, No. 2, 1992, pp. 146-153) described an experiment in which the contact resistance of a brake-only relay was studied for three different materials (all were silver-based alloys).
Data file: Alloy-Contact Resistance.MPJ

Test at an alpha = 0.01 level. Does the type of alloy affect mean contact resistance?

Applied Statistics and Probability for Engineers, 4th Edition, Douglas C. Montgomery and George C. Runger

General Regression can be used for ANOVA

Use for multiple regression – more than one X

General regression can handle: 1) all continuous input(s), 2) all categorical input(s), 3) a mixture of continuous and categorical inputs, and 4) a non-normal response (it allows for the Box-Cox transformation of the response). The response must be continuous or considered as continuous.


General Regression: Example of ANOVA
Note: A blocked One-way ANOVA is a two-way ANOVA where one factor’s effect is to be “blocked out”. The randomization is done within each block.
Background: The forces exerted by three different stylets in a lead are compared at 4 different position/advancement conditions (blocks). The data are given below. Perform an ANOVA analysis using Stat > Regression > General Regression and determine if: (1) there are significant differences between different stylets, and (2) the blocking factor employed was effective.

Force in Grams (Condition is the Block); data file: Stylet.MTW

Condition   Stylet 1   Stylet 2   Stylet 3
1           18.1       14.5       14.0
2           20.0       16.1       16.3
3           30.2       27.5       26.8
4           42.5       39.4       38.7
Mean        27.70      24.38      23.95

Blocked One-way ANOVA
(1) Are there significant differences between different stylets?
(2) Is the blocking factor employed effective?
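For readers who want to see the arithmetic behind the blocked analysis, here is a sketch that partitions the total sum of squares into stylet, block and error components using the force data from the table. It is a hand calculation under the usual additive (no interaction) model, not Minitab's General Regression output; the critical values F(0.05; 2, 6) = 5.14 and F(0.05; 3, 6) = 4.76 are taken from a standard F table.

```python
# Force in grams: rows are conditions (blocks), columns are stylets 1 to 3.
force = [
    [18.1, 14.5, 14.0],
    [20.0, 16.1, 16.3],
    [30.2, 27.5, 26.8],
    [42.5, 39.4, 38.7],
]


def blocked_anova(table):
    """Randomized block ANOVA with one observation per cell."""
    n_blocks = len(table)
    n_treat = len(table[0])
    grand = sum(sum(row) for row in table) / (n_blocks * n_treat)
    block_means = [sum(row) / n_treat for row in table]
    treat_means = [sum(row[j] for row in table) / n_blocks
                   for j in range(n_treat)]

    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_error = ss_total - ss_treat - ss_block

    df_treat = n_treat - 1
    df_block = n_blocks - 1
    df_error = df_treat * df_block
    ms_error = ss_error / df_error
    return {
        "treat_means": treat_means,
        "F_stylet": (ss_treat / df_treat) / ms_error,
        "F_block": (ss_block / df_block) / ms_error,
        "df": (df_treat, df_block, df_error),
    }


result = blocked_anova(force)
stylets_differ = result["F_stylet"] > 5.14    # F(0.05; 2, 6) from a table
blocking_effective = result["F_block"] > 4.76  # F(0.05; 3, 6) from a table
```

A large block F statistic suggests that the conditions account for much of the variation, which is the sense in which blocking is "effective".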

SAMPLE SIZE FOR ANOVA



Planning Sample Size in ANOVA

Sample Size for One-Way ANOVA Example
• Fill in the number of levels for the factor
• Always fill in Standard Deviation (use conservative estimate)
• Then fill in two of the three long boxes
• Can specify several values, separated by spaces

Sample Size for One-Way ANOVA

RESPONDING TO ANOVA SIGNALS



Statistical vs. Practical Significance • Key idea in any hypothesis testing effort – If the test detects a difference (a “signal”), then what? – Don’t assume the signal is automatically bad news (if you’re hoping for consistency) or good news (if you’re hoping for a change) • For example, “ANOVA Failure” in PQ

– Examine the size of the signal in the appropriate context . . . determine the “practical” significance of the difference – The appropriate response depends on an assessment of both statistical and practical significance


ANOVA Signal in PQ • There was a realization that a significant p-value in the comparison of lot means should not necessarily mean the PQ fails • Analysis sometimes included to assess the “power” of the ANOVA and the practical significance of the difference in the means. • Eventually, Corporate Policy on Manufacturing Process Validation added the “ANOVA Failure Flow Chart”



2008 Version of Corporate Guideline for Manufacturing Process Validation


2012 Version of CRDM ANOVA Signal Flow Chart



Pros and Cons
• Pro
  – Provides a consistent way to address the question of practical significance
  – Relatively Simple
  – Effective – expect the approach to stand up to regulatory scrutiny
• Con
  – Can be very prescriptive
  – Standards for Ppk are quite high: 95% confidence bound on Ppk > 1.33
  – Disincentive for larger sample size

Current approaches • Corporate Guideline phased out • CV procedure still has essentially the same ANOVA Signal Flowchart • CRDM originally had a more prescriptive version • CRDM currently has a simplified version • Would also work to include a discussion of the sample size of the ANOVA and the practical significance of the difference • Discussion – other businesses?



Example of ANOVA Signal Flow Chart • Recall the ANOVA exercise on Ch1DataFile.mtw for TubeTensile1, TubeTensile2, TubeTensile3


ANOVA Signal Flow Chart Ppk Analysis
• First stack the 3 lots using Data -> Stack -> Columns
• Then run Stat -> Quality Tools -> Capability Analysis -> Normal
• Add confidence interval for Ppk using Options button

Next steps • Total sample size is 90, so use confidence bound • Lower 95% confidence bound on Ppk is 0.92 • Must make 3 more runs – TubeTensile4, TubeTensile5, TubeTensile6 – These must pass tolerance interval analysis (like the first three runs did) – All six runs pass tolerance interval analysis


Conclusion

Note: Ppk analysis of all six lots is not required. Included here FYI.

Exercise: ANOVA Signal • Run ANOVA and assess practical significance for – In Ch1DataFile.mtw, analyze • WireTensile1, WireTensile2, WireTensile3 • Specification is 3 lb minimum

– Use one of the ANOVA Signal Flowcharts – Then use another approach to determine the practical significance of the difference between the means – Conclusion?


ANOVA: Summary And Recap • Review Quality Trainer • Calculations Deep Dive into ANOVA • Analytically, ANOVA is a special case of Regression • Sample Size • ANOVA Signal Flow chart – some Medtronic divisions use one to standardize response to ANOVA Signal in PQ



EQUIVALENCE TESTING


Statistical Logic for Equivalence
• The basic statistical logic is designed to disprove equality.
  – Null hypothesis: Two population parameters are equal, e.g. μ1 = μ2.
  – Alternative hypothesis: Two population parameters are not equal, e.g. μ1 ≠ μ2.
• We need a different form of logic to affirmatively prove equivalence.
  – Null hypothesis: Two population parameters differ by Δ or more, e.g. |μ1 - μ2| ≥ Δ.
  – Alternative hypothesis: Two population parameters differ by less than Δ, e.g. |μ1 - μ2| < Δ.

Equality vs. Equivalence Part of the confusion around the issue of equivalence is that the concepts of equality and equivalence may not be distinguished. – Equality: Two values/processes are mathematically identical. – Equivalence: The difference between two values/processes is sufficiently small that it can be deemed practically insignificant.


Approach 1: Confidence Intervals
• The idea is to demonstrate that the confidence interval for the difference of interest is fully contained within the range of practical significance [-Δ,Δ].


Jones, BMJ 1996


Approach 1: Confidence Intervals
• Step 1: Define Practical Significance
  – Before collecting data, use scientific/engineering principles to decide what difference, Δ, is practically negligible.
• Step 2: Estimate Sample Size for Experiment
  – Based on characterization data or other assumptions, estimate the sample size needed to produce a confidence interval fully contained within [-Δ,Δ]. (Stat > Power and Sample Size > Sample Size for Estimation)
• Step 3: Collect Data and compute confidence interval.
  – If the confidence interval is a strict mathematical subset of [-Δ,Δ], equivalence may be declared. If not, equivalence is either uncertain or untrue.


Example of Approach 1
• Two processes will be declared equivalent if the difference in their mean outputs is less than 3 micrometers. So Δ=3.
• Based on characterization data,
  – The old process can be modeled as Normal with a mean of 30 and a standard deviation of 2.
  – The new process can be modeled as Normal with a mean of 31 and a standard deviation of 1.
  – Based on mathematical theory, the distribution of (new – old) must also be Normal with a mean of 1 and a standard deviation of sqrt(5) = 2.24.
  – To be conservative in sample size estimation, the standard deviation is rounded up to 3.
  – With an expected mean difference of 1, we need the confidence interval to have a half-width (margin of error) of 2 or less.
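The two numbers used in this planning step can be written out explicitly. The worked equations below assume the old and new processes are independent, which is what allows their variances to add; the symbol names are chosen here for illustration.

```latex
\sigma_{\text{new}-\text{old}}
  = \sqrt{\sigma_{\text{new}}^{2} + \sigma_{\text{old}}^{2}}
  = \sqrt{1^{2} + 2^{2}}
  = \sqrt{5} \approx 2.24

\text{required half-width} = \Delta - \lvert \mu_{\text{new}} - \mu_{\text{old}} \rvert = 3 - 1 = 2
```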


Example of Approach 1

Method
Parameter              Mean
Distribution           Normal
Standard deviation     3 (estimate)
Confidence level       95%
Confidence interval    Two-sided

Results
Margin of Error    Sample Size
2                  12

We need n=12 from BOTH processes.
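One plausible way to arrive at a sample size like this is to search for the smallest n whose t-based margin of error is at or below the target. The sketch below does that with a short table of t(0.975, df) critical values copied from a standard t table. It is an assumption that this mirrors Minitab's Sample Size for Estimation calculation; it is offered only to show the logic.

```python
import math

# Two-sided 95% critical values t(0.975, df), from a standard t table.
T_CRIT_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
              6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
              11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131}


def sample_size_for_margin(std_dev, margin):
    """Smallest n whose t-based 95% margin of error is <= margin."""
    for n in range(2, max(T_CRIT_975) + 2):
        half_width = T_CRIT_975[n - 1] * std_dev / math.sqrt(n)
        if half_width <= margin:
            return n
    raise ValueError("target needs a larger n than the table covers")


n_required = sample_size_for_margin(std_dev=3, margin=2)
```

A normal (z-based) calculation would give a smaller answer, because it ignores the extra uncertainty from estimating the standard deviation.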


Example Output

Two-sample T for New vs Old

        N    Mean     StDev   SE Mean
New    12    30.927   0.858   0.25
Old    12    29.19    1.52    0.44

Difference = mu (New) - mu (Old)
Estimate for difference: 1.735
95% CI for difference: (0.671, 2.798)
T-Test of difference = 0 (vs not =): T-Value = 3.44  P-Value = 0.003  DF = 17

• Conclusions:
  • The processes are statistically different (p=0.003), which is a statement about non-equality.
  • Despite being unequal, the processes are still equivalent. The 95% confidence interval for the difference in means is (0.671, 2.798), which is a strict subset of [-3, 3]
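The equivalence decision can be reproduced from the summary statistics in the output. The sketch below computes the unequal-variance (Welch) standard error and degrees of freedom, builds the 95% interval using t(0.975, 17) = 2.110 from a t table, and checks strict containment in [-Δ, Δ]. Because it starts from rounded summary values, the interval differs from the Minitab output in the third decimal place. Function names are illustrative.

```python
import math


def welch_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Difference, standard error and (truncated) Welch degrees of freedom."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    diff = mean1 - mean2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se, math.floor(df)


def is_equivalent(ci_low, ci_high, delta):
    """True when the interval is a strict subset of [-delta, delta]."""
    return -delta < ci_low and ci_high < delta


diff, se, df = welch_summary(30.927, 0.858, 12, 29.19, 1.52, 12)

T_CRIT = 2.110  # t(0.975, 17), from a standard t table
ci = (diff - T_CRIT * se, diff + T_CRIT * se)
t_value = diff / se
equivalent = is_equivalent(ci[0], ci[1], delta=3)
```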


Approach 1: Summary
• The confidence interval approach is the gold standard for clinical trials and other high scrutiny experiments requiring FDA approval.
• It is mathematically equivalent to a p-value-driven approach called TOST (Two One-Sided T-tests).
• The confidence interval approach is easier to understand than the original form of TOST.

Post-hoc Problems
• Rigorous application of approach 1 requires that the Δ value be established before collecting data.
• What should we do when data have already been collected without defining the difference of interest or planning sample size?

Approach 2: Retrospective Power Analysis
• When data have already been collected without planning for rigorous “equivalence testing”, equivalence may be assessed by displaying an entire power curve.
• Even if this approach does not set a-priori standards for equivalence,
  – it provides additional context for an insignificant p-value
  – it can help engineering experts to make decisions
• Subjective judgment will be required to determine if the experiment was suitably powered to demonstrate equivalence.
• A power curve is a useful supplement to a traditional analysis, but it does not match the rigor in approach 1.

Approach 2 Method
• After collecting the means and standard deviation of the observed data, create a power curve through the Power and Sample Size platform in Minitab.
• Display and interpret the Power Curve in your data analysis report.
• You may honestly believe that your experiment was sufficiently powered (>80%) to detect meaningful differences, but the post-hoc nature of the analysis makes your argument weaker.

Example
• Consider again our old and new processes, which have distributions of N(30, 2²) and N(31, 1²), respectively.
• Suppose we forgot to take approach 1 and instead just collected 5 data points from each process.
• We found a statistical difference when we collected 12 data points, but the p-value goes above 0.05 when collecting only 5:

Two-sample T for New_5 vs Old_5

          N    Mean     StDev   SE Mean
New_5     5    30.744   0.933   0.42
Old_5     5    29.42    3.02    1.4

Difference = mu (New_5) - mu (Old_5)
Estimate for difference: 1.32
95% CI for difference: (-2.61, 5.25)
T-Test of difference = 0 (vs not =): T-Value = 0.93  P-Value = 0.403  DF = 4


Power Curve Inputs
• The observed sample size is n=5
• Desired power levels are in the range of 0.8-0.95
• The pooled standard deviation is 2.24.

Power Curve Output
• With 80% power, this experiment could have detected a difference of about 4.5.
• With 95% power, this experiment could have detected a difference of about 6.
• It is a subjective engineering judgment as to whether such values provide sufficient reassurance about the experimental results.
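The detectable differences quoted above can be approximated by hand. The sketch below uses a common shortcut: the detectable difference is roughly (t for the significance level + t for the power) times the standard error of the difference, with both t quantiles taken at df = 2n − 2 = 8 from a standard t table. This is an approximation, not the noncentral-t calculation a power platform would normally use, so it only roughly matches the power curve.

```python
import math

# t quantiles for df = 8 (two samples of n = 5), from a standard t table.
T_975_DF8 = 2.306                       # two-sided alpha = 0.05
T_POWER_DF8 = {0.80: 0.889, 0.95: 1.860}


def detectable_difference(sd, power, n_per_group=5):
    """Approximate difference detectable at the given power."""
    if n_per_group != 5:
        # The tabulated quantiles above are only valid for df = 8.
        raise ValueError("quantile table covers n_per_group = 5 only")
    se_diff = sd * math.sqrt(2 / n_per_group)
    return (T_975_DF8 + T_POWER_DF8[power]) * se_diff


d80 = detectable_difference(sd=2.24, power=0.80)
d95 = detectable_difference(sd=2.24, power=0.95)
```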


Extensions and Challenges
• Confidence intervals and power curves can be calculated for almost any type of statistical scenario:
  – Comparing 2 means
  – Comparing >2 means
  – Comparing standard deviations
  – Comparing reliability curves
• However, the required sample size for proving equivalence of standard deviations is often much larger than the sample size for means.
• Equivalence for means can reasonably be quantified in terms of arithmetic differences (e.g. |μ1 – μ2| < 5), but equivalence for standard deviations will be quantified in terms of multiplicative differences (e.g. ½ < σ1/σ2 < 2).

Exercise – Lesion Depth
• Consider the key requirement for a new ablation catheter: equivalent (or greater) maximum lesion depth, compared to the current design, where the difference of interest is 0.5 mm.
• Previous data shows
  – Normal distribution model is adequate for Max Lesion Depth
  – Current Design has average max lesion depth of 2.3 mm
  – New Design has average max lesion depth of 2.2 mm
  – Largest pooled standard deviation of max lesion depth is 0.356.
• Follow Approach 1 to plan sample size for the equivalence test
• Assume test data as follows to complete the equivalence analysis
  – New: n=15, mean = 2.733, stdev = 0.342
  – Current: n=15, mean = 2.723, stdev = 0.386
• State your conclusion


Alternate Exercise: Equivalence Testing
• Within your team, identify an example of equivalence testing in your own work.
• Apply Approach 1, using actual or made-up characterization data for the planning step.
• Use Minitab to simulate data collection.
  – Hint: Use Calc -> Random Data -> Normal . . .
• Use Minitab to complete the Approach 1 data analysis.
• State your conclusion from the data.

EQUIVALENCE Take Away Messages
• An insignificant p-value is not a rigorous method of proving equivalence.
• Ideally, practical significance and sample size should be considered before the experiment begins.
• Rigorously proving equivalence first demands carefully defining the threshold (Δ) of practical significance.
• The most rigorous way to prove equivalence is to demonstrate that a confidence interval is fully contained within [-Δ, Δ].
• An alternative, but less formal, approach is to retrospectively perform a power analysis.
• Don’t feel like you need to remember all the Minitab steps; we hope you remember the concepts and call your neighborhood statistician for further support.


Summary and Review
• Quality Trainer Review
• ANOVA
  – Assumptions
  – Using Minitab Assistant vs Stat Menu
  – Calculation Deep Dive
  – Sample Size
  – ANOVA Signals
• Equivalence Testing

Chapter 2: Measurement Systems Analysis

Topics
• Quality Trainer Review
• Topics with Variables Data
  – Gage R&R Sample Size
  – Probability of Misclassification (Variables Data)
  – Helpful Hints
• MSA for Destructive Tests
• MSA for Attribute Tests

Quality Trainer Review


Value of Measurement Systems Analysis

If your goal is . . .                       then MSA helps by . . .
Process Improvement                         Reducing variability in Xs and Ys so that the “key” Xs may be discovered.
Capability Demonstration or Estimation      More accurate measurements of process performance
Sorting Out Bad Product                     Reducing the Probability of Misclassification
Innovation                                  Reduced noise allows discovery of more subtle signals

Recall . . . MSA Concepts
• Bias – Mean (Delta, i.e. difference from reference)
• Linearity – Mean (Bias vs Part or Operating Value)
• Stability – Mean (Bias vs Time)
  …so linearity and stability should be plotted
• Repeatability – Standard Deviation
• Reproducibility – Standard Deviation
• Gage R&R – Standard Deviation
  …while bias, repeatability and reproducibility are just single numbers

Gage Bias and Linearity
• Bias is the difference between the average of repeated measurements and the “true value”
• MSA tends to focus on Gage R&R (variability), but accuracy (= lack of bias) is equally important
  – Assumption that procedures for Calibration are in place – need to confirm
  – Assumption that procedures for Calibration are adequate – need to confirm
• “Linearity” is a study of bias across the range of measured values
• In Minitab, use Stat -> Quality Tools -> Gage Study -> Gage Linearity and Bias Study

Gage Stability
MINITAB® worksheet: Snap Gauge.mtw
> Stat > Control Charts > Variables Charts for Subgroups > Xbar-R
Measurement system is stable over time as evidenced by:

[Figure: Xbar-R Chart of Rep1, ..., Rep3, plotted by Day. Xbar Chart (Sample Mean): center line 0.2497, UCL = 0.253458, LCL = 0.245942, in control. R Chart (Sample Range): center line 0.00367, UCL = 0.00946, LCL = 0, in control.]


GAGE R&R SAMPLE SIZE



Gage R&R Sample Size
• General recommendation:
  – 5 to 10 Parts (P)
  – 2 to 3 Operators (O)
  – 2 to 3 Repeats (R)
• More rigorous methods
  – Specify minimum Degrees of Freedom for estimating Repeatability and Reproducibility standard deviations
  – Use confidence intervals for standard deviation estimates (option provided in Minitab 16)

Degrees of Freedom Approach
• Estimating Reproducibility Std Dev: O-1
  – Include as many operators as feasible
• Estimating Repeatability Std Dev: P*O*(R-1)
  – With 30 df, 90% confidence bound on ratio of estimate to true value is (0.79, 1.21). Ref: on www.minitab.com search for “ID 2613” to access “Minitab Assistant White Papers.”
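The degrees-of-freedom formulas and the (0.79, 1.21) bound can be checked with a short sketch. The bound follows from the chi-square distribution of a variance estimate; since the Python standard library has no chi-square quantile function, the code below substitutes the Wilson-Hilferty normal approximation, which is an assumption of this example and is accurate to about two decimals at 30 df. Function names are illustrative.

```python
import math
from statistics import NormalDist


def repeatability_df(parts, operators, repeats):
    """Degrees of freedom for the Repeatability estimate: P*O*(R-1)."""
    return parts * operators * (repeats - 1)


def reproducibility_df(operators):
    """Degrees of freedom for the Reproducibility estimate: O-1."""
    return operators - 1


def chi_square_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile."""
    z = NormalDist().inv_cdf(p)
    c = 2 / (9 * df)
    return df * (1 - c + z * math.sqrt(c)) ** 3


def sd_ratio_bounds(df, confidence=0.90):
    """Bounds on (estimated std dev / true std dev) for the given df."""
    alpha = 1 - confidence
    lower = math.sqrt(chi_square_quantile(alpha / 2, df) / df)
    upper = math.sqrt(chi_square_quantile(1 - alpha / 2, df) / df)
    return lower, upper


df = repeatability_df(parts=10, operators=3, repeats=2)
lower, upper = sd_ratio_bounds(df)
```

With only 3 operators, the Reproducibility estimate has just 2 degrees of freedom, which is why the slide recommends including as many operators as feasible.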

CVG Test Method Validation



PROBABILITY OF MISCLASSIFICATION


Misclassification
Two Misclassification Probabilities
• Probability of Misclassifying Bad Unit as Good
• Probability of Misclassifying Good Unit as Bad

[Figure: part distribution with LSL and USL marked, showing the regions for Probability of Misclassifying Good Unit as Bad Unit and Probability of Misclassifying Bad Unit as Good Unit]

MINITAB Simulated Estimation of Misclassification: Following Gage R&R study
• Part mean = 30, Part Std Dev = 10, Part Upper Spec = 40
• No measurement system bias
• Gage R&R Std Dev = 2.6

1) Calc/Random Data/Normal (simulate true part measurements)
2) Calc/Random Data/Normal (simulate gage variability)
3) Calc/Calculator: use “+” to add 1) and 2) to simulate observed measurements
4) Calc/Calculator: assign a 1 for in spec for 1). Ex: (‘TrueMeasure’ ≤ 40)
5) Calc/Calculator: assign a 1 for in spec for 3). Ex: (‘ObsMeasure’ ≤ 40)
6) Stat/Table/Crosstabs to crosstabulate 4) and 5).

Estimated % of Truly Out of Spec called In Spec is 2.1%.

The simulation sample size was 10000. A larger sample size would be better.
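The six simulation steps can also be sketched outside Minitab. The code below follows the same logic with the same inputs (part mean 30, part standard deviation 10, upper spec 40, gage standard deviation 2.6, no bias), using a larger simulation size than the slide. The function name, the seed, and the choice of 100,000 draws are assumptions for illustration; results vary slightly with the seed.

```python
import random


def simulate_misclassification(part_mean, part_sd, gage_sd, usl,
                               n_sim, seed=1):
    """Cross-tabulate true vs observed spec status as proportions."""
    rng = random.Random(seed)
    counts = {("in", "in"): 0, ("in", "out"): 0,
              ("out", "in"): 0, ("out", "out"): 0}
    for _ in range(n_sim):
        true_value = rng.gauss(part_mean, part_sd)        # step 1
        observed = true_value + rng.gauss(0.0, gage_sd)   # steps 2 and 3
        true_state = "in" if true_value <= usl else "out"  # step 4
        obs_state = "in" if observed <= usl else "out"     # step 5
        counts[(true_state, obs_state)] += 1               # step 6
    return {key: count / n_sim for key, count in counts.items()}


rates = simulate_misclassification(part_mean=30, part_sd=10, gage_sd=2.6,
                                   usl=40, n_sim=100_000)
bad_called_good = rates[("out", "in")]   # truly out of spec, called in spec
good_called_bad = rates[("in", "out")]   # truly in spec, called out of spec
```

Both rates here are expressed as a fraction of all simulated parts, which matches how a total-percent cross tabulation would report them.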



MINITAB Misclassification



Two problems:
1) Only three decimals for probabilities (i.e. 0.000)
2) Can’t enter historical: 1) process mean 2) part std. dev 3) gage std. dev
(Note: (2) can now be done with CSR Work Aid 13)

Misclassification Using Minitab and Work Aid 13
MINITAB® worksheet: CSRworkaid13 POM.mtw

Load into the worksheet: the Part mean (30), the Part Sigma (10) and the Gage Sigma (2.6)

MINITAB Misclassification Enlarging the label on the sample mean chart, we see the mean is 30.


MINITAB Misclassification
Examining the output, we see the USL (40), the Part Sigma (10) and the Gage Sigma (2.6). The probability of a truly bad part being called good is 0.021.

Probability of Misclassification (POM) Tool
• Originally written in R by Tarek Haddad to recreate functionality lost when Medstat was retired.
• Jim Dawson collaborated with Tarek to continue development and turn it into an Excel tool.
• A substantial Software Validation effort was undertaken by Nick Finstrom and Barry Christy, with the support of Pete Patel and the CVG Test Method Council. Validation work to be completed in early 2014.

POM Tool
• Replicates Medstat functionality
• More resolution in results than Minitab
• Graphics
• Guardbanding
• Normal, Lognormal and Weibull distributions of parts



POM with Guardband


Exercise
• Run POM analysis
– Using Minitab Simulation
– Using Work Aid 13 and Minitab GRR
– Using POM Tool



HELPFUL HINTS


Gage R&R Helpful Hints - Normality
• Normality testing is not needed for Gage R&R analysis
– Distribution of the raw data will depend strongly on the parts used in the study; there is no expectation or assumption that the raw data will follow any specific distribution
– Repeated measurements on the same part by the same operator will likely follow a normal distribution
• Like any ANOVA model, the residuals are assumed to follow a normal distribution, but the analysis is relatively “robust” to non-normality of the residuals
– Probability of Misclassification does depend on the part or process distribution (each part measured once)



Gage R&R Helpful Hints – One-Sided Specification
• In the case of a one-sided specification, the Percent Tolerance metric depends on the part average
• Minitab uses the overall average in the Gage R&R study as the estimate of the part average
• If the parts used in the study are not representative of the expected part distribution . . .
– The overall average will be a poor estimate of the process average
– The percent tolerance result will be misleading
– Best practice would be to calculate Percent Tolerance separately using a better estimate of the process average
– Being “not representative” can be a good practice, for example, including parts that don’t meet the specification


Corrective Actions for Failed Gage R&R
• Repeatability problem
– Could be due to part positional variation
  • Standardize by measuring same position on each part
  • Or make multiple measurements at random or systematic positions and use the average
– If gage itself is too variable, may need to improve or replace
  • In the meantime, Repeatability variability can be filtered out by taking repeated, independent measurements and using the average. Note that this approach does not correct for Reproducibility issues.
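To illustrate how averaging filters out Repeatability variation: for n independent repeated measurements, the standard deviation of their average is the single-measurement repeatability standard deviation divided by sqrt(n). The sketch below is an illustration only; the function names and the example values (2.6 as the repeatability std dev, a target of 1.0) are assumptions, not figures from a specific study.

```python
import math


def sd_of_average(repeatability_sd, n_measurements):
    """Std dev of the mean of n independent repeated measurements."""
    return repeatability_sd / math.sqrt(n_measurements)


def measurements_needed(repeatability_sd, target_sd):
    """Smallest n for which the std dev of the average is at or below target_sd."""
    return math.ceil((repeatability_sd / target_sd) ** 2)


sd_avg_of_4 = sd_of_average(2.6, 4)       # averaging 4 readings halves the std dev
n_for_target = measurements_needed(2.6, 1.0)
print(sd_avg_of_4, n_for_target)
```

This reduction applies only to the repeatability component. Operator-to-operator (Reproducibility) differences are not reduced by averaging readings taken by the same operator.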



Corrective Actions for Failed Gage R&R
• Reproducibility Problem
– Look for assignable causes that explain the operator-to-operator differences
– Understand any Operator*Part interactions; these may provide clues to differences in technique.
– Possibly improve the measurement procedure and/or re-train the operators
– Improve any visual aids or samples used in the measurement procedure


Approaches to Robust Gage R&R 

Standard Gage R&R methods assume that other factors that affect measurements have been studied and controlled in the development of the test method.



If these sources of variability still affect the measurements, then . . . 

The Expanded Gage R&R allows you to add additional factors. Besides operator & part, you could add fixture number, gage number or other factors. The Expanded GRR can also handle missing data.



Reference: “Make Your Destructive, Dynamic, and Attribute Measurement System work for you” by William Mawby. 



This book includes the Analysis Of Covariance method that allows one to load in the varying environmental factors like temperature & humidity (covariates) into a GRR.

The General Linear Model in Minitab (under the ANOVA branch) can be used to model covariates (also handles missing data).



MSA FOR DESTRUCTIVE MEASUREMENTS


Two Types of Destructive Measurements
1. Truly destructive: Measurement destroys unit being measured
– Pull test
– Peel test
– Tensile test
2. Non-replicable: Measurement process can change the unit or you are measuring a transient phenomenon
– Catapult distance
– Motor speed
– Heart rate
– Dimension of silicon part (can compress)
– Dimensions of heart tissue (can compress)

In neither case is it possible to take repeated measures, so gage R&R is not possible.

Ref: Make Your Destructive, Dynamic, and Attribute Measurement System Work for You, by W. D. Mawby



Approaches to Destructive MSA

Approach: Develop a non-destructive measurement
– Pro: Ideal solution
– Con: Often difficult or impossible

Approach: Attempt to use identical parts as “repeat” measurements and apply usual requirements for GRR %Tolerance
– Pro: Easy to apply usual Minitab calculations
– Con: Rarely works because parts aren’t actually identical

Approach: Use a coupon test so that parts are more identical
– Pro: Results better than above
– Con: Coupons may not be representative (easier to measure than real parts)

Approach: Focus on improving the measurement process using DMAIC
– Pro: Proven methodology
– Con: Cannot conclude whether measurement system is adequate

Approach: Focus on Reproducibility
– Pro: Not affected by part-to-part variability
– Con: Might miss a Repeatability issue


What about using “Nested” Gage R&R?
• The “nested” Gage R&R analysis applies when one operator measures different parts than another operator.
– For example, John measures parts 1, 2, 3, 4, 5 repeatedly and Jane measures parts 6, 7, 8, 9, 10 repeatedly.
– Common application would be “Inter-laboratory Testing,” where operators at each location measure different parts repeatedly.
– Can work for Destructive MSA if each homogeneous sample may be sub-sampled. Then operators can measure different samples repeatedly.
• Analysis
– The nested analysis does not include a term for Part * Operator interaction.
– Note that Minitab Assistant doesn’t offer the Nested analysis
• Unless sub-sampling of homogeneous material is possible, Nested does not solve the key problem of Destructive MSA
– It’s impossible to repeat the measurement



Destructive Gage R&R Example

MINITAB® worksheet: TestingSupplierCoils.mtw

• Tensile testing of tubing
• 8 pieces of tubing
• Each tubing cut into 2 sub samples
• Assume variation between sub samples due to measurement error
• Assume an upper specification of 850 g

Destructive Gage R&R using sub-samples




Nearly all measurement system variation is due to repeatability rather than operator (reproducibility) . . . or maybe sub-sample differences?

Large result for % Tolerance

Measurement system does not distinguish one part from another within the range of parts used in the study


Destructive Gage R&R using sub-samples
• Destructive Gage R&R using sub-samples gave poor results
• Since repeatability accounts for most of the apparent measurement variation, it is likely that parts were not very similar
• In this project they used the DMAIC Process Knowledge method to improve the system without obtaining a formal measurement


Focus on Reproducibility
• With destructive measurements, the Repeatability Standard Deviation always includes the part-to-part or subsample-to-subsample variation. In general, repeatability standard deviation cannot be accurately estimated.
• If one population of parts is randomly assigned to multiple operators, then the Reproducibility Standard Deviation is not affected by part-to-part variation.
• Reproducibility standard deviation can be estimated accurately even for destructive tests.


Reproducibility
• Stop
– Trying to force (Repeatability + Part) Standard Deviation to be small enough to meet a requirement.
– Trying to obtain or create “identical” parts.
• Start
– Estimate Reproducibility standard deviation and ensure that it is small enough. This standard deviation depends only on the differences between operator means.
– Compare operator standard deviations. Identify cases where operators show substantially different variation across equivalent sets of parts.

Example: CVG Test Method Validation for Destructive Tests
• Obtain a population of 40 parts
– Do not need to get identical or nearly identical parts
• Randomly assign 10 parts to each of 4 operators
• Calculate %Tolerance for Reproducibility
– Compare to requirement of 25%
• Calculate Std Dev Ratio
– Compare to simulation-based critical values (for a typical study, the critical value is 3.10)


Example Calculations
• Data based on actual TMV studies
– But altered to disguise
– Detection Time A, Detection Time P


Detection Time A



Run One-Way ANOVA

• Reproducibility standard deviation = sqrt((0.778 - 0.627)/10) = 0.123


Calculate Results

• % Tolerance (Reproducibility) = 100 * (6*0.123) / (2*(30 - 11.740)) = 100 * (0.738 / 36.52) = 2.02%
• Std Dev Ratio = 0.986 / 0.546 = 1.81
• Result: Pass


Detection Time P


Calculations for Detection Time P

• Reproducibility standard deviation = sqrt((11.225 - 0.976)/10) = 1.01
• % Tolerance (Reproducibility) = 100 * (6*1.01) / (2*(30 - 14.798)) = 100 * (6.06 / 30.40) = 19.9%
• Std Dev Ratio = 1.113 / 0.846 = 1.32
• Result: Pass
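The two worked examples (Detection Time A and Detection Time P) follow the same arithmetic, which can be sketched as below. This is an illustrative sketch only. It assumes that the two numbers inside the square root are the between-operator and error mean squares from the one-way ANOVA, that 10 is the number of parts per operator, that 30 is a one-sided upper specification, and that 11.740 and 14.798 are the overall means. The function names and the use of "<=" for the pass criteria are assumptions as well.

```python
import math


def reproducibility_sd(ms_operator, ms_error, parts_per_operator):
    """Between-operator std dev from one-way ANOVA mean squares (floored at 0)."""
    variance = max(0.0, (ms_operator - ms_error) / parts_per_operator)
    return math.sqrt(variance)


def pct_tolerance_one_sided(sd, spec_limit, process_mean):
    """100 * 6*sd / (2 * distance from the mean to the single spec limit)."""
    return 100.0 * (6.0 * sd) / (2.0 * abs(spec_limit - process_mean))


def std_dev_ratio(largest_operator_sd, smallest_operator_sd):
    """Largest operator std dev divided by the smallest."""
    return largest_operator_sd / smallest_operator_sd


def passes(pct_tol, ratio, tol_limit=25.0, ratio_critical=3.10):
    """Limits taken from the slides: 25% and 3.10 for a typical study."""
    return pct_tol <= tol_limit and ratio <= ratio_critical


# Detection Time A
sd_a = reproducibility_sd(0.778, 0.627, 10)
tol_a = pct_tolerance_one_sided(sd_a, 30, 11.740)
ratio_a = std_dev_ratio(0.986, 0.546)

# Detection Time P
sd_p = reproducibility_sd(11.225, 0.976, 10)
tol_p = pct_tolerance_one_sided(sd_p, 30, 14.798)
ratio_p = std_dev_ratio(1.113, 0.846)

print(f"A: sd={sd_a:.3f}, %Tol={tol_a:.2f}, ratio={ratio_a:.2f}, pass={passes(tol_a, ratio_a)}")
print(f"P: sd={sd_p:.3f}, %Tol={tol_p:.2f}, ratio={ratio_p:.2f}, pass={passes(tol_p, ratio_p)}")
```

The code carries full precision through each step, so the Detection Time P percentage comes out slightly higher than the slide value, which rounds the standard deviation to 1.01 before multiplying.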



Exercises
• Open Destructive Exercises.mtw
• For Bond Strength results:
– Assume specification is Minimum 5 lb
– Analysis
  • Individual Value Plot
  • % Tolerance for Reproducibility
  • Std Dev Ratio
  • Is this destructive measurement system adequate?
• Repeat for Buckle Force results
– Assume specification is Maximum 340 grams

MSA FOR ATTRIBUTE MEASUREMENTS



ATTRIBUTE GAGE R&R
• Attribute data are usually the result of human judgment
– Which category does this item belong in?
• When categorizing items, you need a high degree of agreement on which way an item should be categorized
• The best way to assess human judgment is to have all operators repeatedly categorize several known test units (Attribute Gage R&R)
– Look for agreement
  • each person categorizes the same unit consistently
  • there is agreement between the operators on each unit
– Use disagreements as opportunities to determine and eliminate problems


SETTING UP AN ATTRIBUTE GAGE STUDY
• Most important aspect of attribute Gage Study is selecting parts (representative defects)
• Most challenging aspect is choosing parts for the study. Typically use . . .
– 50% acceptable parts
– 50% defective parts
• Have operators repeatedly classify parts in random order without knowledge of which part they are classifying (blind study)



Analysis of Attribute Gage R&R
• Stat > Quality Tools > Attribute Agreement Analysis
– Percent Agreement based on number of Parts
– Kappa Statistics (range -1 to 1)
• Minitab Assistant > Measurement System Analysis
– More graphical output
– Accuracy statistics based on number of Appraisals
– No Kappa statistics

Use Minitab Assistant > Measurement Systems Analysis (MSA)


Create Attribute Agreement worksheet


Create Result Data
• Choose Number of Appraisers = 3
• Choose Number of Trials = 2
• Choose Number of Test Items = 10
• Items 1-5 are “Good”; Items 6-10 are “Bad”
• Click “OK”
• Copy column “Standards” and paste into “Results”
• Fix column name back to “Results”
• Find first trial of Item 1 and Item 2
– Change result from “Good” to “Bad” to inject two errors into the simulated study
• Save onto Desktop as “Attribute GRR”

Attribute Agreement Analysis


Summary Report
Attribute Agreement Analysis for Results

Is the overall % accuracy acceptable? (gauge from < 50% “No” to 100% “Yes”): 96.7%
The appraisals of the test items correctly matched the standard 96.7% of the time.

Misclassification Rates
Overall error rate                            3.3%
Good rated Bad                                6.7%
Bad rated Good                                0.0%
Mixed ratings (same item rated both ways)     6.7%

[Bar chart: % Accuracy by Appraiser for Appraiser 1, Appraiser 2 and Appraiser 3; bar values 100.0, 100.0 and 90.0, overall 96.7%]

Comments
Consider the following when assessing how the measurement system can be improved:
-- Low accuracy rates: Low rates for some appraisers may indicate a need for additional training for those appraisers. Low rates for all appraisers may indicate more systematic problems, such as poor operating definitions, poor training, or incorrect standards.
-- High misclassification rates: May indicate that either too many Good items are being rejected, or too many Bad items are being passed on to the consumer (or both).
-- High percentage of mixed ratings: May indicate items in the study were borderline cases between Good and Bad, thus very difficult to assess.

Slide annotations:
• Attribute “c=0” result . . . Showing that no bad parts were misclassified as good
• Overall, 96.7% of presentations were classified correctly

Accuracy Report
Attribute Agreement Analysis for Results

All graphs show 95% confidence intervals for accuracy rates. Intervals that do not overlap are likely to be different.

[Figure: interval plots on a 40 to 100 scale, with panels % by Appraiser, % by Standard (Good, Bad), % by Trial (1, 2), and % by Appraiser and Standard]

Slide annotation: Illustrates the 95% / 90% result

Kappa
• Kappa is a measure of raters’ agreement.
• Minitab:
– Reports two Kappa statistics: Fleiss’ & Cohen’s
– Defaults to Fleiss’ Kappa
– Will only calculate Cohen’s Kappa if you choose the option for Cohen’s Kappa, and if one of these two conditions is true:
  A) Two appraisers perform a single trial on each sample
  B) One appraiser performs two trials on each sample
• Kappa is meant for attribute data.
• Kappa ranges from -1 to 1.
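As a hedged illustration of what a Kappa statistic measures, the sketch below computes Cohen’s Kappa for the two-rater, single-trial case (condition A above) using the textbook definition: observed agreement corrected for the agreement expected by chance. It is not Minitab’s implementation, it does not cover Fleiss’ Kappa, and the function name and example ratings are invented.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two raters who each rate the same items once."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category proportions.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n**2
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both raters use one identical category")
    return (observed - expected) / (1.0 - expected)


rater_1 = ["Good"] * 5 + ["Bad"] * 5
rater_2 = ["Good"] * 4 + ["Bad"] * 6   # disagrees with rater_1 on one item
kappa = cohens_kappa(rater_1, rater_2)
print(f"Kappa = {kappa:.2f}")
```

In this example the raters agree on 9 of 10 items, but half of that agreement would be expected by chance, so Kappa is lower than the raw percent agreement.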


Kappa (Landis and Koch)

According to AIAG (Auto industry), a general rule of thumb is:
• A Kappa value greater than 0.75 indicates good to excellent agreement.
• Kappa values less than 0.40 indicate poor agreement.

This general rule of thumb may not apply for most Medtronic applications. Any disagreement on rejectable units would be of concern.

Kappa calculations


Kappa results



Summary and Recap
• Quality Trainer Review
• Topics with Variables Data
– Gage R&R Sample Size
– Probability of Misclassification (Variables Data)
– Helpful Hints
• MSA for Destructive Tests
• MSA for Attribute Tests


BACKUP SLIDES



Destructive Gage R&R - 2 Nested Designs

2 Stage Nested Design Approach
• Samples are parts that can be subdivided into homogeneous sub-samples.
• Stage 1: 1 operator measures sub-samples (2-5) from parts (5-10).
• Stage 2: 3 operators each measure same location per part (5-10).

[Diagram: Stage 1 shows 1 operator, parts 1 to 10, and locations 1 to 5 within each part. Stage 2 shows 1 sub-sample per part, operators 1 to 3, and parts 1 to 10 under each operator.]


Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

MINITAB® project: Destructive 2 stage nested.mpj

• Project: Pull testing of die bond. Parts are die. Sub-samples are 5 wire locations on the die. Spec = 7.5 grams minimum.
• Stage 1: 1 operator pull tests all 5 wire locations on each of 10 die.
• Stage 2: Each of 3 operators pull tests 10 die at wire location 1.

Destructive Gage R&R - 2 Stage Die Bond Example (cont.) Stage

1: Stat > ANOVA > Full y Nest ed ANOVA

From worksheet: stage1

2 part

Nested ANOVA: Pull Strength versus Die Var i ance Component s

Sour ce Di e Err or Tot al

Var Comp. 0. 088 0. 479 0. 567

ˆ 

% of Tot al 15. 50 84. 50

St Dev 0. 296 0. 692 0. 753


Destructive Gage R&R - 2 Stage Die Bond Example (cont.) Stage

2: Stat > ANOVA > Full y Nest ed ANOVA

From worksheet: stage2

Nested ANOVA: Pull Strength (Wire 1) versus Operator Var i ance Component s ˆ2 

operator

Sourc e Oper at or Err or Tot al

Var Comp. 0. 053 0. 428 0. 481

% of Tot al 11. 08 88. 92 2 part / repeat

St Dev 0. 231 0. 654 0. 694

ˆ  72 | MDT Confidential

36

Destructive Gage R&R - 2 Stage Die Bond Example (cont.) Manual

calculation of Gage Repeatability and Reproducibility

2 2 ˆ repeat = part / repeat

ˆ

ˆ

2

part

= 0.428 – 0.088 = 0.340 2

R& R

= 0.340 + .053

= 0.393

Compare Gage R&R variance to part variance if parts are chosen to be representative of production process. Since this is a one-sided spec (7.5 grams) use Misclassification to determine gage acceptance. 73 | MDT Confidential
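The manual calculation can be written as a short script, which also makes it easy to guard against a negative repeatability estimate when the Stage 1 part variance happens to exceed the Stage 2 error variance. This is a sketch only: the function and variable names are illustrative, and flooring the variance at zero is an assumption rather than something stated in the course material.

```python
import math


def two_stage_grr(var_part, var_part_plus_repeat, var_operator):
    """Combine the two nested ANOVA stages into repeatability and R&R variances."""
    var_repeat = max(0.0, var_part_plus_repeat - var_part)  # remove part variation
    var_rr = var_repeat + var_operator                      # add reproducibility
    return var_repeat, var_rr


# Stage 1 Die variance component, Stage 2 Error and Operator variance components
var_repeat, var_rr = two_stage_grr(var_part=0.088,
                                   var_part_plus_repeat=0.428,
                                   var_operator=0.053)
sd_rr = math.sqrt(var_rr)
print(f"Repeatability variance: {var_repeat:.3f}")
print(f"R&R variance:           {var_rr:.3f}")
print(f"R&R std dev:            {sd_rr:.3f}")
```

The R&R standard deviation from this calculation is the gage sigma that would feed a misclassification analysis for the one-sided 7.5 gram specification.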

Kappa – Call Center Example

Call Center workers were asked to categorize types of calls they received.

MINITAB® worksheet: Callcat.mtw

Kappa Attribute Analysis: Option Setting


Kappa : Within Appraiser Agreement



Kappa: Each Appraiser vs Standard


Kappa for Appraisers
• What do we conclude from this analysis about the raters’ performance?
• What would you do next?
• Can this method be applied to the banana data?

Distribution Analysis
The Art of Finding Useful Models
Jeremy Strief, Ph.D.
MECC Principal Statistician

Objectives
• Explain why distributional analysis is statistically complicated (and sometimes emotionally frustrating!)
• Emphasize the importance of engineering theory and historical precedent.
• Encourage the use of multiple graphical methods in addition to numerical tests.
• Review common causes of Non-Normality.
• Discuss Transformations and how they compare to fitting non-Normal distributions.

MedtronicConfidential


Recap from Quality Trainer
• Normal Distribution Basics
• Capability Analysis (Normal)
• Capability Analysis (Non-Normal)
• Graphical tools
– Boxplots
– Histograms
– Individual Value Plots


Distribution Analysis Motivation and Philosophy


Why Assess Distribution
• Statistical tools vary in sensitivity to and effect of distributional assumptions
• Some MDT procedures require distributional assessment for those statistical methods which are highly sensitive to distributional assumptions

Statistical Tool                      Distributional Sensitivity   Effect of Poor Distributional Fit
Capability Analysis                   High                         Incorrect PPM/Ppk
Tolerance Intervals                   High                         Incorrect bounds
Variables Lot Acceptance Sampling     High                         Altered rejection and acceptance rates
Individuals Chart for SPC             High                         Incorrect control limits
GLM / Regression / ANOVA              Med                          Approximate p-value
Xbar chart for SPC                    Med/Low                      Approximate p-value
Two-sample t-test                     Low                          Approximate p-value
Non-parametric methods                Low                          Approximate p-value


Not All Data Are Normal: Example

[Figure: Histogram of Time (Frequency vs. Time, 0 to 50)]

Lead Time Data usually have a long tail (skewed distribution)

[Figure: Probability Plot of Time, Normal (Percent vs. Time). Mean 12.31, StDev 9.656, N 100, AD 5.738, P-Value < 0.005]

Not All Data are Normal: Considerations
• Observed data need not follow any tractable mathematical model.
• Some mathematical models may be useful, if imperfect, representations of the data.


Frustrations with Distributional Analysis

• Larger sample sizes (n>100) cause the statistical tests to detect small departures from a theoretical model. Such departures may not be practically significant.
• Smaller sample sizes (n<15) often yield multiple distributions with p-values greater than 0.05. Graphs may look sparse and thus may not narrow one’s choice of distribution.
• Note: for both cases the data need to come from a process in control.

The Underlying Statistical Hypotheses
• The statistical hypothesis testing is ‘backward,’ in that the null hypothesis assumes that the particular distribution is a good fit.
– H0: Distribution specified has a good fit
– H1: Distribution specified has lack-of-fit
• Low p-values will disprove the fit of a distribution, so certain distributions can be ruled out as reasonable models.
• Using the standard goodness-of-fit metrics, it is technically not possible to prove that a particular distribution is the “true model” for the data.
• Instead of providing statistical “proof”, distribution analysis is geared toward assessing which statistical distributions are plausible models for the data at hand.


Philosophy of Distribution Analysis

“All models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind.” --G.E.P. Box



N=15 Probability Plots


N=500 Examples

Only 12 out of 500 values were affected by the truncation or censoring.

How to Determine Distribution

Priority order
1. Scientific/Engineering Knowledge
2. Historical distribution analysis
3. Distribution analysis

Why is distribution analysis last?
• Sample size (50 to 100)
• Regardless of n, key Xs and shift and drift can mask true distribution

Distribution applies to short term data only


Importance of Engineering Theory
• The choice of distribution should be both statistically plausible and scientifically justified.
• Engineering theory and historical precedents often suggest whether a distribution should be Normal, Lognormal, or Weibull.
• If scientific theory does not lead to one single statistical model, at least consider
– Whether the distribution should be skewed or symmetric
– Which distributions can be ruled out

Data Analysis Philosophy
• Information shouldn’t be destroyed. Examples of information destruction are
– Converting variables data to attribute data.
– Heavy rounding with a bad measurement system.
– Drifting measurement system.
• Check the quality and structure of the raw data.
– Are there physically impossible values, wild outliers, missing values, too many ties?
– Are the data paired or unpaired?
– Was randomization employed?
– How were the data generated?

Data Analysis Philosophy
• Plot the data AND do analytics.
– PLOT histograms, run charts, scatter plots, etc. See what is going on. Do a probability plot for process data.
– Use ANALYTICS to get quantitative about what you have seen. Examine the residual plots from analytical model fits.
• Analyses are performed on yesterday’s data today to predict tomorrow’s performance.
– Data from an unstable process that is analyzed (ignoring the instability) may result in a conclusion that will not hold up tomorrow.

Distribution Analysis Review of Engineering Distributions

Most Common Statistical Models for Engineering Applications

• Weibull
• Exponential (special case of Weibull)
• Lognormal
• Normal

Weibull
• A flexible model which can assume many different shapes, depending on the choice of parameters
• Scale parameter α or η
• Shape parameter β
• Arises from “weakest link” failures, or situations when the underlying process focuses on the minimum or maximum value of independent, positive random variables.
• Models stress-strength failures


Exponential
• Special case of Weibull when β=1
• Constant hazard rate, meaning that the probability of failure is not a function of the age of the device/material.
• May occur when multiple failure modes are operating simultaneously
• May be useful in modeling software failures resulting from external sources (e.g. cosmic radiation causes bit-flips at an extremely low, constant rate)
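The “constant hazard rate” property can be checked with the standard two-parameter Weibull hazard function, h(t) = (β/η)(t/η)^(β−1). The sketch below is illustrative only; the function name and the parameter values (scale 50, shapes 0.5, 1 and 2) are arbitrary choices for the example.

```python
def weibull_hazard(t, scale, shape):
    """Hazard rate of a two-parameter Weibull at time t > 0."""
    if t <= 0 or scale <= 0 or shape <= 0:
        raise ValueError("t, scale and shape must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


times = (1, 10, 100)
constant = [weibull_hazard(t, scale=50.0, shape=1.0) for t in times]    # Exponential case
increasing = [weibull_hazard(t, scale=50.0, shape=2.0) for t in times]  # hazard grows with age
decreasing = [weibull_hazard(t, scale=50.0, shape=0.5) for t in times]  # hazard falls with age
print(constant, increasing, decreasing)
```

With shape β = 1 the hazard equals 1/η at every age, which is the Exponential case. A shape above 1 gives a hazard that rises with age, and a shape below 1 gives a hazard that falls with age.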



Lognormal
• Models time-to-failure caused by several forces which combine multiplicatively.
• Describes time to fracture from fatigue crack growth in metals.
• Right skewed distribution, useful when data values span multiple orders of magnitude (e.g. 1.4, 14, 140).
• Two parameters (μ,σ), each of which is traditionally expressed on the log scale.
• So if X~Lognormal(μ,σ), then ln(X)~Normal(μ,σ)
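A quick way to see the log-scale parameterization is to simulate Lognormal data and look at the logs. The sketch below uses Python’s standard library generator `random.lognormvariate(mu, sigma)`, whose arguments are the mean and standard deviation of ln(X). The seed, sample size, and parameter values are arbitrary assumptions for the illustration.

```python
import math
import random
import statistics

rng = random.Random(7)
mu, sigma = 2.0, 0.8   # parameters on the log scale

x = [rng.lognormvariate(mu, sigma) for _ in range(20_000)]
log_x = [math.log(value) for value in x]

mean_log = statistics.mean(log_x)     # should be close to mu
sd_log = statistics.stdev(log_x)      # should be close to sigma
mean_raw = statistics.mean(x)         # pulled upward by the long right tail
median_raw = statistics.median(x)     # should be close to exp(mu)

print(f"log scale: mean={mean_log:.3f}, sd={sd_log:.3f}")
print(f"raw scale: mean={mean_raw:.2f}, median={median_raw:.2f}")
```

On the raw scale the mean sits well above the median, which is the right skew described above. On the log scale the data behave like a Normal sample with mean μ and standard deviation σ.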


Normal
• Models time-to-failure caused by additive, independent forces
• Commonly describes gage error, dimensional measurements from a supplier, and other symmetric, bell-shaped phenomena

Additional Models to Consider
• Logistic
• Smallest Extreme Value (SEV)
• Largest Extreme Value (LEV)


Some Relationships
• SEV distribution = ln(Weibull distribution).
• LEV distribution = ln(1/Weibull distribution).
• Normal distribution = ln(Log-normal distribution).
• All Weibull distributions can be rescaled and repowered to get another Weibull.
• The Weibull(100,4) is very close to a Normal (mean = 90.64, s.d. = 25.43). This normal is thicker in the tails than the Weibull(100,4).

Ref: 02SR013 “Algorithm for Computing Weibull Sample Size for Complete Data”
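The mean and standard deviation quoted for the Weibull(100,4) follow from the standard Weibull moment formulas, mean = η·Γ(1 + 1/β) and variance = η²·[Γ(1 + 2/β) − Γ(1 + 1/β)²]. A small sketch, assuming (100, 4) means scale 100 and shape 4; the function name is illustrative.

```python
import math


def weibull_mean_sd(scale, shape):
    """Mean and standard deviation of a two-parameter Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    mean = scale * g1
    sd = scale * math.sqrt(g2 - g1 ** 2)
    return mean, sd


mean, sd = weibull_mean_sd(scale=100.0, shape=4.0)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```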



Review: Common Engineering Distributions

[Diagram: typical applications grouped by distribution]
• Weibull: Wearout; Infant mortality; Time to stress/strength related failure
• Normal: Default; Measurement error; Dimensions
• Lognormal: Lead Time; Time to fatigue related failure

Distribution Analysis Statistical Overview

Statistical Approach to Distribution Analysis
• Both graphical and numerical approaches are needed
• P-value is not definitive, given the “backward” nature of hypothesis testing
• Visual assessment of the probability plot is crucial
• Reasonably large sample sizes (~50) are needed. Consult your local procedures (e.g. DOC000550 within CRDM) for specific rules.


Distribution Analysis Graphical Methods


Good Distribution Analysis Should Always Begin With Plots!

• Probability plots
• Histograms
• Time plots


Probability Plot
• A probability plot is a 2-dimensional plot with specialized (often logarithmic) axes, to facilitate comparison between observed data and a hypothesized distribution.
• More specifically, a probability plot is a comparison between the observed and theoretical quantiles (i.e. percentiles) for a hypothesized distribution.

Probability Plot Interpretation
• If the distribution is a good fit to the data, the plotted points should fall approximately in a straight line.
• When interpreting the probability plot, examine both the p-value and the visual fit.
– At the tails of the distribution, look whether the points are falling on the conservative side of the fitted line.
– Look for major deviations in the pattern of points from a straight line: kinks, ties, curves, jumps, etc. Do not worry if a few points fall outside the confidence bounds.
– Fat Pencil Test: Can the observed data values be covered up by a “fat pencil”?


Probability Plot in Minitab



Probability Plot Examples
Right skew and curvature:

Large N makes for obvious curvature:


Probability Plot Examples
“Subtle Patterns” can be caused by randomness

Both datasets were sampled directly from a Normal distribution.

Probability Plot Examples
• Distribution does not pass the Anderson-Darling test, but the lower tail of the distribution falls on the conservative side of the fitted line.
• Distribution appears to have a lower limit of zero
• It would be conservative to use the Normal model to estimate the lower tail behavior.


Histograms in Minitab
The graph menu offers a histogram platform, but the graphical summary platform offers more information with fewer clicks.

Histograms
• More intuitive than probability plots, since the x-y axes are not transformed.
• Not informative with small sample sizes (<30)
• Can theoretically be misleading if the bin width is calculated inappropriately, but in practice the histogram is a useful tool for moderate-to-large sample sizes

[Figures: two example histograms, captioned “Apparent right skew” and “Approximately Bell-Shaped”]


Time Plots
• Fitting a single distribution to your data implies that the underlying process is stable.
• Without a stable process, distributional fit is irrelevant.
• Time plots and control charts help evaluate the stability of your process.

Why is Stability needed to Assess Distribution?

MINITAB® worksheet: Distribution Analysis Shift and Drift.mtw

Distribution Assessment Risks
• Shift and Drift, and Variation in Key Xs, mask the distribution
• Initial capability data always contains Shift and Drift
• At Final Capability, process is stable and variation in Key Xs is removed

Example data:
• 100 samples from Week 1
• 25 samples from Week 2
• 100 samples from Week 3

Distribution applies to short term data only


Initial Process Data often have Shift and Drift

[Figure: I Chart of Initial Capability Data (Individual Value vs. Observation, 1 to 221). X-bar = 19.93, UCL = 26.30, LCL = 13.55. Many points are flagged with test 1.]

Long Term Data May not be Normal

[Figure: Probability Plot of Initial Capability Data, Normal - 95% CI (Percent vs. Initial Capability Data). Mean 19.93, StDev 9.679, N 225, AD 13.617, P-Value < 0.005]

Combined Data is not normal

But Short Term Data Could be Normal

[Figure: Probability Plot of Initial Capability Data by Week, Normal - 95% CI (Percent vs. Initial Capability Data)]

Week   Mean    StDev   N     AD      P
1      9.871   2.155   100   0.476   0.233
2      20.39   2.203   25    0.280   0.616
3      29.87   2.011   100   0.236   0.785

Each week is normal


Distribution Analysis Numerical Methods

Numerical Methods
• For all numerical methods:
– A large (≥0.05) p-value implies there is no evidence against the hypothesized distribution.
– A small (<0.05) p-value implies there is statistically significant lack-of-fit.
• It is commonly stated that a distributional test “passes” when p ≥ 0.05.
• A passing test does NOT mean that the hypothesized distribution is “correct” or “best.” There may be multiple models which fit the data, and you should choose whichever model best matches science and historical precedent.

Most Common Normality Tests
• Anderson-Darling (AD) test
• Ryan-Joiner test

Note: The Ryan-Joiner test is essentially equivalent to the Shapiro-Wilk test.


Anderson-Darling
• Default approach in Minitab.
• May be used to assess fit of Normal and non-Normal distributions.
• Gives unreliable results when data are discretized/grouped, which is fairly common when measurement system resolution is poor.

Anderson-Darling in Minitab
For assessing Normality:


Anderson-Darling in Minitab
For any/all distributions:


Anderson-Darling Results
Normal(10,1.5)

Normal(10,1.5)--Rounded


Ryan-Joiner
• Useful for discretized, rounded, or clumpy data
• Will not declare significant lack-of-fit simply due to poor measurement resolution
• Recommended minimum of 5 “groups” to have a meaningful p-value. Fewer groups may yield an overly optimistic (high) p-value.

[Figures: probability plots labeled Anderson-Darling and Ryan-Joiner]

Ryan-Joiner in Minitab


Truncation
• The Normal distribution may be used to model tail behavior if it provides a conservative estimate of those tails.
• This situation arises when data are truncated, which is quantitatively captured as negative kurtosis.

Truncation
• In principle, truncated data may be evaluated graphically or through a Skewness-Kurtosis (SK) test.
• The SK test checks whether the tails of the Normal distribution are longer or shorter than the tails of your data.
• MECC has created and validated an Excel spreadsheet (R134997) which executes the SK test.
• In practice, consult your local procedures to ensure your analysis of truncated data is compliant.

Avoiding Parametric Distributions Altogether
• Chebyshev’s inequality captures the tail behavior of any statistical distribution with a finite variance.
– For any random variable X with mean μ and standard deviation σ, and any constant k > 1, P(|X − μ| ≥ kσ) ≤ 1/k²
• This inequality may be useful for skipping the issue of distributional fit altogether, especially if distributional fit is being assessed in order to compute a tolerance interval.
• Chebyshev’s inequality will only be helpful if the process capability is extremely high.
• Consult your own procedures for details, but CRDM procedures invoke the following version of Chebyshev:
– If the nearest specification is at least 10 standard deviations away from the mean, it may be inferred by Chebyshev that at least 99% of the distribution will fall within specification.
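To make the bound concrete, the sketch below evaluates the Chebyshev limit 1/k² and, for comparison, the two-sided tail probability of a Normal distribution at the same k. It is an illustration only; the function names are invented, and the Normal comparison is included simply to show how conservative the distribution-free bound is.

```python
import math


def chebyshev_tail_bound(k):
    """Upper bound on P(|X - mean| >= k * sd) for any finite-variance distribution."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k ** 2)


def normal_two_sided_tail(k):
    """P(|Z| >= k) for a standard Normal, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))


bound_at_10 = chebyshev_tail_bound(10)        # spec limit 10 std devs from the mean
min_fraction_within = 1.0 - bound_at_10       # guaranteed share within 10 std devs

for k in (2, 3, 6, 10):
    print(f"k={k:>2}: Chebyshev <= {chebyshev_tail_bound(k):.4f}, "
          f"Normal = {normal_two_sided_tail(k):.2e}")
```

At k = 10 the bound gives the 99% figure quoted above. The bound is far looser than the Normal tail probability at the same k, which is why it only helps when the specification is very far from the mean.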



Why Normality Tests Fail
1. A shift occurred in the middle of the data
2. Multiple sources or multiple failure modes with different distributions
3. Outliers
4. Piled up data
5. Truncated data (sorted before you get it)
6. The underlying distribution is not normal (skewed)
7. Poor measurement resolution
8. Too much data (overpowered to detect non-normality)
9. Due to random chance: you expect the test to fail 5% of the time (i.e. 95% confidence) if the data were truly from a normal distribution.

Resolving Non-Normality

#     Cause                                      Possible resolutions
1     Data shift                                 Sublot; Skewness/kurtosis test; Attribute sampling
2     Multiple data sources                      Sublot; Skewness/kurtosis test; Attribute sampling
3     Outliers                                   Attribute sampling; Outlier removal (may remove outliers only if they constitute typos or data collection errors)
4/5   Censored/Truncated data (tails lost)       Skewness/kurtosis test; Conservative fitting; Attribute sampling
6     Distribution not normal                    Non-normal analysis; Transformation; Attribute sampling
7     Poor measurement resolution                Ryan-Joiner
8     Too much data                              Graphical evidence
9     Random chance                              Skewness/kurtosis test; Random subsampling; Historical assessment

When Multiple Distributions Fit
Prior engineering knowledge is particularly useful when multiple distributions yield p-values above 0.05:
– Picking the distribution solely based on best p-value or best R² is rational when there is absolutely no history or scientific theory.
– A better approach is to assemble a list of plausible (p > 0.05) distributions and then make a final choice based upon history and science.
– P-values will sometimes be below 0.05 simply as a result of chance (Type I error). It is not recommended to immediately change years of analysis based on one significant p-value. Investigate and monitor before changing distributions.

Avoid the daily special
– Do NOT take the “distribution du jour” approach, in which multiple distributions are chosen for a single process. This reflects either:
  • An out-of-control process, which can’t be captured by a single distribution anyway.
  • The bad statistical practice of just defaulting to the distribution with the highest p-value.

Example: Capability for Non-Normal Data using Tribal Knowledge for Distribution MINITAB®

LoanApplicationTime.MTW

Problem Statement: Time (in days) to process (reject/accept) loan applications is too long, causing loss in customer applications
Project Goal: Decrease potential customer loss from 15% to 5%. Customer expectation is 20 days.
Project Strategy: Path Y = Time
Task: Determine capability for Y = Time

Assume lead time has a LogNormal Distribution


Verify Lognormal Distribution

[Figure: Probability Plot of Time, Lognormal - 95% CI. Axes: Percent vs. Time (log scale). Loc = 2.269, Scale = 0.6845, N = 100, AD = 0.432, P-Value = 0.299]

Check if LogNormal provides a good fit

Capability for Non-Normal Data using LogNormal

Process Capability of Time
Calculations Based on Lognormal Distribution Model
[Figure: capability histogram of Time (0 to 50) with USL marked]

Process Data: LSL = *; Target = *; USL = 20; Sample Mean = 12.31; Sample N = 100; Location = 2.26918; Scale = 0.684493
Overall Capability: Z.Bench = 1.06; Z.LSL = *; Z.USL = 0.47; Ppk = 0.16
Exp. Overall Performance: PPM < LSL = *; PPM > USL = 144242; PPM Total = 144242
Observed Performance: PPM < LSL = *; PPM > USL = 160000; PPM Total = 160000
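The expected PPM above the USL in the output above can be reproduced from the fitted Location and Scale with a minimal sketch. It assumes only that Time is LogNormal, so that ln(Time) is Normal with mean Location and standard deviation Scale; the variable names are illustrative.

```python
from math import log
from statistics import NormalDist

# Fitted lognormal parameters and spec limit, as reported in the output above.
location = 2.26918   # mean of ln(Time)
scale = 0.684493     # standard deviation of ln(Time)
usl = 20.0           # upper spec limit, in days

std_normal = NormalDist()

# Work on the log scale, where the data are modelled as Normal.
z_usl_log = (log(usl) - location) / scale
p_above = 1.0 - std_normal.cdf(z_usl_log)
ppm_above = 1e6 * p_above

# With a single spec limit, Z.Bench is the Normal quantile matching the total
# expected fraction in spec.
z_bench = std_normal.inv_cdf(1.0 - p_above)

print(f"Expected PPM > USL: {ppm_above:.0f}")
print(f"Z.Bench: {z_bench:.2f}")
```

This reproduces the Exp. Overall PPM and Z.Bench figures to rounding. The reported Z.USL and Ppk come from a different, percentile-based calculation and are not reproduced by this sketch.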


Distribution Analysis Transformations


Two Options • When a dataset is non-Normal, it is acceptable either to – Mathematically transform the data to achieve Normality – Fit a non-Normal distribution

• Transformation carries the practical advantage that many statistical methods are based upon Normality, so there will be more analytical tools available for the transformed dataset.
• Transformation carries the disadvantages of creating unnatural units (e.g. log-meters instead of meters) and altering potentially relevant structures of the data.
• Note: Please do NOT try transformations of data from an unstable process, or of bimodal data (two bumps).


Transformation Advice • If a transformation is chosen, it should be as simple as possible, and it should ideally have a physical interpretation. • A log transformation is particularly desirable, since it – Is monotonic – Is straightforward to interpret (it turns multiplicative effects into additive effects) – Is equivalent to the LogNormal distribution – Is common in the literature



Transformation Advice
• The Johnson transformation is a last resort, as it
  – Rarely has any scientific/engineering meaning
  – Involves a complicated mathematical structure
  – Is not universally considered an “acceptable” transformation
• Any Box-Cox transformation with a lambda value within [-2, 2] is typically acceptable, although the chosen lambda should ideally have a physical meaning.


Transformation Advice
• There is no transformation which will eliminate outliers!
  – By definition, an outlier is so far away from the rest of the data values that it is unlikely to belong to the same distribution.
  – An attribute approach is typically needed when outliers are present.
  – Investigate the outlier and determine if there were any typos or other unusual circumstances which would warrant deletion.
  – Outliers should NOT be deleted unless there is a strong argument as to why the outlier is not representative of the process.
  – An apparent outlier could possibly be a “typical” datapoint from a highly skewed distribution, like LogNormal or LEV.
  – Use engineering thinking as well as statistical thinking to decide the best course of action for outlier mitigation.
• Stay consistent in your choice of transformation. Inconsistency implies an unstable process/distribution.

Box-Cox Transformations (when there is no theoretical distribution)

Assumptions for Y
• Y > 0; Y is skewed (right or left)
• Y is unimodal (single peak)

Box-Cox determines the transform to make Y normal
• $Y^{(\lambda)} = (Y^{\lambda} - 1)/\lambda$ for $\lambda \neq 0$
• $Y^{(\lambda)} = \log_e(Y)$ for $\lambda = 0$

Use Box-Cox when there is no theoretical distribution


Box-Cox Transformations (when there is no theoretical distribution)

Typical Box-Cox transformations
  λ = 2      Y² transformation
  λ = 0.5    sqrt(Y) transformation
  λ = 0      log_e(Y) transformation
  λ = -0.5   1 / sqrt(Y) transformation
  λ = -1     1 / Y transformation

Use Box-Cox when there is no theoretical distribution
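The Box-Cox family can be sketched in a few lines. This is an illustration of the textbook formula only, with a hypothetical function name; statistical packages may apply the simple power Y^λ instead of the scaled form, which differs only by a linear rescaling for λ ≠ 0.

```python
from math import log


def box_cox(y: float, lam: float) -> float:
    """Textbook Box-Cox transform: (y**lam - 1)/lam, or ln(y) when lam == 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return log(y)
    return (y**lam - 1.0) / lam


# The typical lambdas correspond to Y^2, sqrt(Y), ln(Y), 1/sqrt(Y) and 1/Y.
for lam in (2, 0.5, 0, -0.5, -1):
    print(f"lambda = {lam:>4}: box_cox(4, lambda) = {box_cox(4.0, lam):.4f}")
```

The scaled form is continuous in λ: as λ approaches 0, (Y^λ - 1)/λ approaches ln(Y), which is why the log transformation sits at λ = 0.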



Example: Capability for Non-Normal Data using Box-Cox
MINITAB®: Error Resolution Time.MTW

Problem Statement: Time (in days) to resolve errors in case report forms for a pre-market clinical evaluation is too long, causing delay in the product release
Project Goal: Decrease error resolution time. Expectation is 7 days.
Project Strategy: Path Y = Resolution Time
Task: Determine capability for Y = Resolution Time


Example: Verify LogNormal

[Figure: Probability Plot of Resolution Time, Lognormal - 95% CI. Axes: Percent vs. Resolution Time (log scale). Loc = 1.760, Scale = 1.303, N = 200, AD = 3.623, P-Value < 0.005]

Fails
• 3 second rule
• Fat pencil test
• p-value

Not LogNormal!

Example: Apply Box-Cox Transformation


Example: Determine Optimal Lambda
[Figure: Box-Cox Plot of Resolution Time. Axes: StDev vs. Lambda (-1 to 3), with Lower CL, Upper CL and Limit marked]

Lambda (using 95.0% confidence): Estimate = 0.26; Lower CL = 0.15; Upper CL = 0.38; Rounded Value = 0.26

Box-Cox transformation of Y. Default = rounded value

Example: Calculate Capability for Non-Normal Data Using Box-Cox


Example: Capability for Box-Cox Transformed Y
Process Capability of Resolution Time Using Box-Cox Transformation With Lambda = 0.26
[Figure: capability histogram of transformed data (0.4 to 2.8) with USL* marked; Within and Overall curves]

Process Data: LSL = *; Target = *; USL = 7; Sample Mean = 10.2928; Sample N = 200; StDev(Within) = 9.25009; StDev(Overall) = 9.5492
After Transformation: LSL* = *; Target* = *; USL* = 1.65972; Sample Mean* = 1.66391; StDev(Within)* = 0.503383; StDev(Overall)* = 0.485671
Potential (Within) Capability: Z.Bench = -0.01; Z.LSL = *; Z.USL = -0.01; Cpk = -0.00; CCpk = -0.00
Overall Capability: Z.Bench = -0.01; Z.LSL = *; Z.USL = -0.01; Ppk = -0.00; Cpm = *
Observed Performance: PPM < LSL = *; PPM > USL = 520000.00; PPM Total = 520000.00
Exp. Within Performance: PPM < LSL* = *; PPM > USL* = 503324.68; PPM Total = 503324.68
Exp. Overall Performance: PPM < LSL* = *; PPM > USL* = 503445.93; PPM Total = 503445.93

Capability = Z.Bench (Potential)
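As a rough check on the output above, the capability figures can be recomputed from the "After Transformation" values, treating the transformed data as Normal. The sketch below uses the rounded values printed in the output, so small differences in the last digits are expected; the function name is hypothetical.

```python
from statistics import NormalDist

# Transformed-scale values as printed in the output above (marked with * there).
usl_t = 1.65972
mean_t = 1.66391
sd_within_t = 0.503383
sd_overall_t = 0.485671


def upper_spec_capability(usl: float, mean: float, sd: float) -> tuple[float, float]:
    """Return (Z.USL, expected PPM above USL) under a Normal model."""
    z_usl = (usl - mean) / sd
    ppm_above = 1e6 * (1.0 - NormalDist().cdf(z_usl))
    return z_usl, ppm_above


z_within, ppm_within = upper_spec_capability(usl_t, mean_t, sd_within_t)
z_overall, ppm_overall = upper_spec_capability(usl_t, mean_t, sd_overall_t)

print(f"Within:  Z.USL = {z_within:.2f}, PPM > USL* = {ppm_within:.0f}")
print(f"Overall: Z.USL = {z_overall:.2f}, PPM > USL* = {ppm_overall:.0f}")
```

Because the transformed mean sits almost exactly on the transformed USL, Z is close to zero and roughly half of the resolution times are expected to exceed 7 days.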


A Desirable Problem
• If your data could be handled either through a transformation or a non-Normal distribution, either path is acceptable.
• All else being equal, a recommended prioritization is as follows:
  1. Log Transformation (= LogNormal model)
  2. Weibull/Exponential model
  3. Box-Cox with lambda ≠ 0 but lambda within [-2, 2]
  4. Other engineering distribution (SEV/LEV, logistic, etc.)
• Any prioritization scheme should be interpreted as a heuristic, not as the one true path.
• The most important thing is to plot your data and arrive at a mathematical solution which makes sense within the engineering/scientific context at hand.
• As much as possible, remain consistent in your choice of statistical method. Avoid the “distribution du jour” or “transformation du jour.”


Distribution Analysis Flowchart

Normality Testing Flowchart: CRDM
• CRDM: Meant as a teaching aid, not an official quality doc.

Distribution Analysis Summary and Challenge Problem

Objectives Recap
• Explain why distributional analysis is statistically complicated (and sometimes emotionally frustrating!)
• Emphasize the importance of engineering theory and historical precedent.
• Encourage the use of multiple graphical methods in addition to numerical tests.
• Review common causes of Non-Normality
• Discuss Transformations and how they compare to fitting non-Normal distributions

Distribution Analysis Commentary
• Distribution fitting is NOT about finding the true distribution for your data; statistical theory CANNOT prove that a particular distribution is the true model for the data.
  – A model is “true” if it still fits when the sample size approaches infinity.
  – With engineering data, it is often the case that distributions are approximately Normal when N=50, but taking N=200 or N=500 will show small, but statistically significant, departures from Normality.
  – In such a situation, the Normal distribution is often still a useful model even if it is not a “true model.”
• Instead of providing scientific truth, distribution analysis is geared toward assessing which statistical distributions are plausible models for the data at hand.

Distribution Analysis Commentary
• Good distribution fitting should combine statistical analysis with engineering/scientific thinking.
• Even before any data are collected, engineering theory and historical precedents often suggest a distributional form:
  – Does the process involve any sort of maximization or minimization of physical forces? If so, then Weibull might be a good model.
  – Does the process involve the averaging of multiple small forces? If so, the Normal might be a good model.
  – Are there historical precedents which suggest which model is best?
• Ideally, the chosen distribution should have a non-significant p-value (p > 0.05) AND it should intuitively match with engineering principles.

Don’t Forget Business Context
• Usually distribution analysis is just one step in a larger analytical problem.
• Keep the larger business/engineering problem in mind, as it may suggest
  – Whether only one tail of the dataset needs to be modeled.
  – Whether a single low p-value might be a statistical false alarm.
  – Whether the model needs to produce highly precise numbers or just be “in the ballpark.”

Challenge Problem
• MECC Supplier Dataset: mecc_supplier.mtw
• Business goal is to qualify the supplier as having high capability, and possibly to create a variables or attribute acceptance sampling plan.
• LSL: 0.058
• USL: 0.064
• Analyze the data and offer your opinion of what distribution is best for the situation at hand.
• What questions would you ask the Supplier Quality Engineer to help refine your decision?

Process Capability Analysis

Objectives

• QT Review
• Process Capability

Recap from Quality Trainer
• Introduction
• Process Capability for Normal Data
• Capability Indices
• Process Capability for Non-Normal Data
• Summary

A5 Process Capability
• Measuring Process Capability
• Sigma Scale, Z scores, DPM=PPM
• Process Capability Indices (Cp, Cpk, Pp, Ppk)
• Impact of Normality & Process Stability
• Attribute Data
• Non-normal Data
• Minitab Assistant
• Impact of Sample Size (Confidence Limits)
• Comparison to Tolerance Intervals
• Impact of Measurement Error

SIX SIGMA QUALITY LEVEL
Process Capability: Comparison between what the process produces vs. what is required
[Figure: Histogram of Process Output (Density vs. X, 80 to 140) with Customer Requirement at 130]
• Mean = 100, Std Dev = 5
• Z = 6.0 (the requirement is 6σ from the mean)
• Defect Rate: 1 part per billion
• NOTE: 2 parts per billion for two-sided specs

SIX SIGMA QUALITY LEVEL
To Estimate Long-Term Performance, Apply a 1.5σ SHIFT IN MEAN
[Figure: Histogram of Process Output (Density vs. X, 80 to 140) with Customer Requirement at 130]
• Mean = 107.5, Std Dev = 5
• Z = 4.5 (the requirement is 4.5σ from the shifted mean)
• Defect Rate: 3.4 parts per million

SIGMA SCALE

Process Sigma    z              Standard Normal Tail
(Short-Term)     (Long-Term)    Area Probability P(Z>z)    DPMO       % Conforming
6.0               4.5           0.0000034                  3.4        99.99966
5.5               4.0           0.0000317                  32         99.99683
5.0               3.5           0.0002327                  233        99.9767
4.5               3.0           0.0013500                  1,350      99.865
4.0               2.5           0.0062097                  6,210      99.379
3.5               2.0           0.0227501                  22,750     97.72
3.0               1.5           0.0668072                  66,807     93.32
2.5               1.0           0.1586553                  158,655    84.13
2.0               0.5           0.3085375                  308,538    69.15
1.5               0.0           0.5000000                  500,000    50.00
1.0              -0.5           0.6914625                  691,462    30.85
0.5              -1.0           0.8413447                  841,345    15.87
0.0              -1.5           0.9331928                  933,193    6.68
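Each row of the sigma scale follows from the standard Normal tail area after subtracting the 1.5 shift. The sketch below regenerates the table; the function name is illustrative, and the 1.5 shift is the convention stated in this chapter rather than a statistical law.

```python
from statistics import NormalDist

SHIFT = 1.5  # conventional short-term to long-term shift used on the slides


def sigma_scale_row(process_sigma: float) -> tuple[float, float, float, float]:
    """Return (long-term z, P(Z > z), DPMO, % conforming) for a short-term sigma level."""
    z = process_sigma - SHIFT
    tail = 1.0 - NormalDist().cdf(z)
    return z, tail, 1e6 * tail, 100.0 * (1.0 - tail)


print("Sigma     z     P(Z>z)       DPMO   %Conforming")
for tenth in range(60, -1, -5):          # 6.0, 5.5, ..., 0.0
    sigma = tenth / 10
    z, tail, dpmo, pct = sigma_scale_row(sigma)
    print(f"{sigma:5.1f} {z:5.1f}  {tail:.7f} {dpmo:10.1f}   {pct:.5f}")
```

The same function gives the two histogram examples: a requirement 6σ from the mean corresponds to z = 4.5 after the shift, or about 3.4 defects per million.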

Process Capability Indices
• Cp, Pp: Process Capability Ratio. Variation only, ignores mean.
• Cpk, Ppk: Includes mean to account for centering.
• Cp, Cpk use the within-subgroup variation estimate of σ (Short-Term, Potential).
• Pp, Ppk use the overall sample standard deviation estimate (Long-Term, Actual).

Process Capability Ratio
• Spec Width: USL – LSL defines allowable variation
• 6σ defines actual process variation
[Figure: Process Fallout and the process capability ratio (PCR)]

Within Subgroup Unbiasing Constants

Simulating Within Subgroup Variation
EXERCISE
1. Randomly sample from a known population: Normal (μ=100, σ=5)
   Using Calc > Random Data > Normal
   Simulate 10,000 subgroups of size n=5 by placing data into C1-C5
2. Compute the Mean (X-bar), Range (R), and StDev (S) for each subgroup
   Using Calc > Row Statistics; place into C6, C7, C8
3. Compute the Variance (S²), R/d2 and S/c4 (for n=5, d2=2.326, c4=0.939986)
   Using Calc > Calculator; place into C9, C10, C11
4. Evaluate the performance of the 6 statistics in columns C6-C11 in estimating population parameters
   Using Stat > Basic Statistics > Display Descriptive Statistics or Stat > Basic Statistics > Graphical Summary
5. Which statistics are biased and which ones are unbiased?

NOTE: Average (R/d2) = (R-bar/d2)
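For readers without Minitab at hand, the exercise can be approximated with the Python standard library. This is a sketch of the same simulation, not the course solution; the seed and variable names are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(12345)                     # arbitrary seed so the run is repeatable

MU, SIGMA = 100.0, 5.0                 # known population: Normal(100, 5)
N_SUBGROUPS, N = 10_000, 5             # 10,000 subgroups of size n = 5
D2, C4 = 2.326, 0.939986               # unbiasing constants for n = 5

stats = {"Xbar": [], "R": [], "S": [], "S^2": [], "R/d2": [], "S/c4": []}
for _ in range(N_SUBGROUPS):
    subgroup = [random.gauss(MU, SIGMA) for _ in range(N)]
    r = max(subgroup) - min(subgroup)  # subgroup range
    s = stdev(subgroup)                # sample standard deviation (n - 1 divisor)
    stats["Xbar"].append(mean(subgroup))
    stats["R"].append(r)
    stats["S"].append(s)
    stats["S^2"].append(s * s)
    stats["R/d2"].append(r / D2)
    stats["S/c4"].append(s / C4)

# Average each statistic over all subgroups and compare with the known parameters.
summary = {name: mean(values) for name, values in stats.items()}
for name, value in summary.items():
    print(f"mean of {name:<5} = {value:8.3f}")
```

In a run like this the averages of X-bar, S², R/d2 and S/c4 land close to 100, 25, 5 and 5 respectively, while the average of S falls below 5, which is the bias that c4 corrects.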

Estimating Within Subgroup StDev

Data Source: 20 subgroup samples of 5 parts taken from a component manufacturing process. Data are coded x 0.0001 in. + 0.50 in.
Applied Statistics & Probability for Engineers, 6th Edition (Montgomery & Runger, Wiley 2013)
LSL = 25; USL = 45; Target = 35

[Figure: Xbar-S Chart of Component Data, n=5 per subgroup, Samples 1 to 19 labelled.
 Xbar chart (Sample Mean): UCL = 36.82, X-double-bar = 33.32, LCL = 29.82; several points flagged with test 1.
 S chart (Sample StDev): UCL = 5.116, S-bar = 2.449, LCL = 0; one point flagged with test 1.]

NOTE: Center line on S chart is calculated backward from the Unbiased Pooled StDev: 2.605*0.939986 = 2.449 for subgroup size n=5

Stat > Quality Tools > Capability Analysis > Normal . . .
Process Capability of Component Data, n=5 per subgroup
[Figure: capability histogram (27 to 45) with LSL, Target and USL marked; Within and Overall curves]

Process Data: LSL = 25; Target = 35; USL = 45; Sample Mean = 33.32; Sample N = 100; StDev(Within) = 2.60524; StDev(Overall) = 3.29946
Potential (Within) Capability: Cp = 1.28; CPL = 1.06; CPU = 1.49; Cpk = 1.06
Overall Capability: Pp = 1.01; PPL = 0.84; PPU = 1.18; Ppk = 0.84; Cpm = 0.90
Observed Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00
Exp. Within Performance: PPM < LSL = 702.65; PPM > USL = 3.68; PPM Total = 706.32
Exp. Overall Performance: PPM < LSL = 5840.77; PPM > USL = 200.09; PPM Total = 6040.85

PPM Estimates Assume a Stable Process That Is Normally Distributed
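The indices in the output above can be reproduced from the spec limits, the sample mean and the two standard deviation estimates. The sketch below uses the standard definitions (ratio = (USL - LSL)/6σ, one-sided indices = distance to the spec limit over 3σ, and the "k" index as the smaller of the two); the function name and dictionary keys are illustrative only.

```python
def capability_indices(lsl: float, usl: float, mean: float, sd: float) -> dict[str, float]:
    """Capability ratio and one-sided indices for a given sigma estimate."""
    lower = (mean - lsl) / (3.0 * sd)      # CPL or PPL
    upper = (usl - mean) / (3.0 * sd)      # CPU or PPU
    return {
        "ratio": (usl - lsl) / (6.0 * sd), # Cp or Pp: spec width over 6 sigma
        "lower": lower,
        "upper": upper,
        "k": min(lower, upper),            # Cpk or Ppk: the worse side
    }


# Values printed in the Minitab output above for the component data.
LSL, USL, MEAN = 25.0, 45.0, 33.32
within = capability_indices(LSL, USL, MEAN, sd=2.60524)    # Cp, CPL, CPU, Cpk
overall = capability_indices(LSL, USL, MEAN, sd=3.29946)   # Pp, PPL, PPU, Ppk

print({name: round(value, 2) for name, value in within.items()})
print({name: round(value, 2) for name, value in overall.items()})
```

Passing the within-subgroup StDev gives the potential (Cp family) indices, and passing the overall StDev gives the Pp family; only the sigma estimate changes.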


Stat > Basic Statistics > Normality Test . . .
[Figure: Probability Plot of Component Data (Subgroup Data Stacked), Normal. Axes: Percent vs. Data (25 to 45). Mean = 33.32, StDev = 3.299, N = 100, AD = 0.620, P-Value = 0.104]

Stat > Quality Tools > Capability Sixpack > Normal . . .
[Figure: Process Capability Sixpack of Component Data.
 Xbar Chart: UCL = 36.82, X-double-bar = 33.32, LCL = 29.82.
 R Chart: UCL = 12.81, R-bar = 6.06, LCL = 0.
 Last 20 Subgroups: Values vs. Sample.
 Capability Histogram: LSL = 25, Target = 35, USL = 45.
 Normal Prob Plot: AD = 0.620, P = 0.104.
 Capability Plot: Within (StDev = 2.605, Cp = 1.28, Cpk = 1.06, PPM = 706.32); Overall (StDev = 3.299, Pp = 1.01, Ppk = 0.84, Cpm = 0.90, PPM = 6040.85); Specs.]

NOTE: Center line on R chart is calculated backward from the Unbiased Pooled StDev: 2.605*2.326 = 6.06 for subgroup size n=5

Stat > Quality Tools > Capability > Between/Within . . .
Between/Within Capability of Component Data, n=5 per subgroup
[Figure: capability histogram (27 to 45) with LSL, Target and USL marked; B/W and Overall curves]

Process Data: LSL = 25; Target = 35; USL = 45; Sample Mean = 33.32; Sample N = 100; StDev(Between) = 2.37001; StDev(Within) = 2.60524; StDev(B/W) = 3.52197; StDev(Overall) = 3.29946
B/W Capability: Cp = 0.95; CPL = 0.79; CPU = 1.11; Cpk = 0.79
Overall Capability: Pp = 1.01; PPL = 0.84; PPU = 1.18; Ppk = 0.84; Cpm = 0.90
Observed Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00
Exp. B/W Performance: PPM < LSL = 9080.54; PPM > USL = 456.04; PPM Total = 9536.58
Exp. Overall Performance: PPM < LSL = 5840.77; PPM > USL = 200.09; PPM Total = 6040.85

NOTE: Use Between/Within analysis when there is significant variation between subgroups

Estimating Within StDev from Individual Measurements

Data Source: 20 samples of individual measurements of concentration taken at one-hour intervals from a chemical process.
Applied Statistics & Probability for Engineers, 6th Edition (Montgomery & Runger, Wiley 2013)
LSL = 95; USL = 105; Target = 100

[Figure: I-MR Chart of Concentration, Observations 1 to 19 labelled.
 Individual Value chart: UCL = 105.98, X-bar = 99.10, LCL = 92.21.
 Moving Range chart: UCL = 8.461, MR-bar = 2.589, LCL = 0.]
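The I chart limits can be recovered from the average moving range. The sketch below assumes the usual individuals-chart convention, in which the within StDev is estimated as MR-bar / 1.128 (d2 for moving ranges of two consecutive points) and the limits sit at the mean plus or minus three of those. The mean of 99.095 is the sample mean reported in the Minitab capability output for the same data; the variable names are illustrative.

```python
D2_MOVING_RANGE = 1.128     # d2 for ranges of n = 2 consecutive observations

mean_x = 99.095             # sample mean of Concentration (from the Minitab output)
mr_bar = 2.589              # average moving range (center line of the MR chart)

# Short-term (within) standard deviation estimated from the moving ranges.
sigma_within = mr_bar / D2_MOVING_RANGE

# Individuals chart control limits at +/- 3 sigma_within.
ucl = mean_x + 3.0 * sigma_within
lcl = mean_x - 3.0 * sigma_within

print(f"StDev(Within) ~ {sigma_within:.3f}")
print(f"UCL ~ {ucl:.2f}, LCL ~ {lcl:.2f}")
```

Because MR-bar is rounded on the chart, the estimate agrees with the reported StDev(Within) only to about three significant figures.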


Stat > Quality Tools > Capability Analysis > Normal . . .
Process Capability of Concentration
[Figure: capability histogram (94 to 104) with LSL, Target and USL marked; Within and Overall curves]

Process Data: LSL = 95; Target = 100; USL = 105; Sample Mean = 99.095; Sample N = 20; StDev(Within) = 2.29563; StDev(Overall) = 1.97603
Potential (Within) Capability: Cp = 0.73; CPL = 0.59; CPU = 0.86; Cpk = 0.59
Overall Capability: Pp = 0.84; PPL = 0.69; PPU = 1.00; Ppk = 0.69; Cpm = 0.76
Observed Performance: PPM < LSL = 50000.00; PPM > USL = 0.00; PPM Total = 50000.00
Exp. Within Performance: PPM < LSL = 37226.30; PPM > USL = 5051.62; PPM Total = 42277.92
Exp. Overall Performance: PPM < LSL = 19117.21; PPM > USL = 1402.63; PPM Total = 20519.84

PPM Estimates Assume a Stable Process That Is Normally Distributed

Stat > Basic Statistics > Normality Test . . .
[Figure: Probability Plot of Concentration, Normal. Axes: Percent vs. Concentration (95.0 to 105.0). Mean = 99.10, StDev = 1.976, N = 20, AD = 0.398, P-Value = 0.333]


Stat > Quality Tools > Capability Sixpack > Normal . . .
[Figure: Process Capability Sixpack of Concentration.
 I Chart: UCL = 105.98, X-bar = 99.10, LCL = 92.21.
 Moving Range Chart: UCL = 8.461, MR-bar = 2.589, LCL = 0.
 Last 20 Observations: Values vs. Observation.
 Capability Histogram: LSL = 95, Target = 100, USL = 105.
 Normal Prob Plot: AD = 0.398, P = 0.333.
 Capability Plot: Within (StDev = 2.296, Cp = 0.73, Cpk = 0.59, PPM = 42277.92); Overall (StDev = 1.976, Pp = 0.84, Ppk = 0.69, Cpm = 0.76, PPM = 20519.84); Specs.]

Compare Process Capability Indices to Diagnose Improvement Actions
• Cp: Potential Capability (Inherent Variation)
• Ppk: Overall Performance
• Cp vs. Cpk: Disparity Indicates Centering Issue
• Pp vs. Ppk: Disparity Indicates Centering Issue
• Cp vs. Pp: Disparity Indicates Stability Issue
• Cpk vs. Ppk: Disparity Indicates Stability Issue


FOUR POSSIBILITIES (Donald J. Wheeler)
• Is Process In Statistical Control? Assessed with Control Charts (LCL, UCL)
• Is Process Capable of Meeting Requirements? Assessed with Process Capability Indices (Cp, Cpk, Pp, Ppk); requires LSL, USL

                      In Control: Yes                    In Control: No
Capable: Yes          Ideal State (Monitor)              Brink of Chaos (Remove Special Causes)
Capable: No           Threshold State (Alter System)     State of Chaos

Centered, Stable, Capable
[Figure: Time Series Plot of A (Index 1 to 100) with limits at 90 and 110; Process Capability of A histogram with LSL and USL, Within and Overall curves]

Process Data: LSL = 90; Target = *; USL = 110; Sample Mean = 99.8045; Sample N = 100; StDev(Within) = 1.48539; StDev(Overall) = 1.45512
Potential (Within) Capability: Cp = 2.24; CPL = 2.20; CPU = 2.29; Cpk = 2.20
Overall Capability: Pp = 2.29; PPL = 2.25; PPU = 2.34; Ppk = 2.25; Cpm = *
Observed Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00
Exp. Within Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00
Exp. Overall Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00

Not Centered, Stable, Potentially Capable
[Figure: Time Series Plot of B (Index 1 to 100) with limits at 90 and 110; Process Capability of B histogram with LSL and USL, Within and Overall curves]

Process Data: LSL = 90; Target = *; USL = 110; Sample Mean = 106.814; Sample N = 100; StDev(Within) = 1.36089; StDev(Overall) = 1.4326
Potential (Within) Capability: Cp = 2.45; CPL = 4.12; CPU = 0.78; Cpk = 0.78
Overall Capability: Pp = 2.33; PPL = 3.91; PPU = 0.74; Ppk = 0.74; Cpm = *
Observed Performance: PPM < LSL = 0.00; PPM > USL = 20000.00; PPM Total = 20000.00
Exp. Within Performance: PPM < LSL = 0.00; PPM > USL = 9616.16; PPM Total = 9616.16
Exp. Overall Performance: PPM < LSL = 0.00; PPM > USL = 13080.51; PPM Total = 13080.51

Centered, Stable, Not Capable
[Figure: Time Series Plot of C (Index 1 to 100) with limits at 90 and 110; Process Capability of C histogram with LSL and USL, Within and Overall curves]

Process Data: LSL = 90; Target = *; USL = 110; Sample Mean = 100.309; Sample N = 100; StDev(Within) = 4.73733; StDev(Overall) = 4.66247
Potential (Within) Capability: Cp = 0.70; CPL = 0.73; CPU = 0.68; Cpk = 0.68
Overall Capability: Pp = 0.71; PPL = 0.74; PPU = 0.69; Ppk = 0.69; Cpm = *
Observed Performance: PPM < LSL = 20000.00; PPM > USL = 30000.00; PPM Total = 50000.00
Exp. Within Performance: PPM < LSL = 14772.82; PPM > USL = 20394.82; PPM Total = 35167.63
Exp. Overall Performance: PPM < LSL = 13515.67; PPM > USL = 18831.51; PPM Total = 32347.18

Centered, Unstable, Potentially Capable
[Figure: Time Series Plot of D (Index 1 to 100) with limits at 90 and 110; Process Capability of D histogram with LSL and USL, Within and Overall curves]

Process Data: LSL = 90; Target = *; USL = 110; Sample Mean = 100.106; Sample N = 100; StDev(Within) = 1.23073; StDev(Overall) = 3.78355
Potential (Within) Capability: Cp = 2.71; CPL = 2.74; CPU = 2.68; Cpk = 2.68
Overall Capability: Pp = 0.88; PPL = 0.89; PPU = 0.87; Ppk = 0.87; Cpm = *
Observed Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00
Exp. Within Performance: PPM < LSL = 0.00; PPM > USL = 0.00; PPM Total = 0.00
Exp. Overall Performance: PPM < LSL = 3782.21; PPM > USL = 4459.79; PPM Total = 8242.00

Critical Thinking of Data & Analysis is Required for Valid Inferences

DATA + CONDITIONS: How was it collected? At a single point in time, or over multiple time points? Were all sources of variation acting during the data collection timeframe?
ANALYSIS (STATISTICS):
• Control Charts: Is variation stable over time?
• Process Capability Indices: Cp, Cpk (Within, Short-Term); Pp, Ppk (Overall, Long-Term)
INFERENCE (PREDICTION): Future Performance

QUALIFICATION STUDY: Data collected at a single time point over limited conditions. Therefore, control charts and Ppk may not reflect long-term performance since the analysis was computed from a short-term data set. Recommend using study sample size as the subgroup size for capability analysis.

How to Evaluate Process Capability
• Stability: Compare Cpk to Ppk or Cp to Pp
• Centering: Compare Cp to Cpk or Pp to Ppk
• Variation: Compare Cp, Pp to 1.0
• World Class Performance: Cpk > 2, Ppk > 1.5

How to Improve Process Capability
(1) Make Process Stable
(2) Center Process Mean
(3) Reduce Process Variation
(4)* Widen Specification Limits
* What is required for option #4? What quality system requirement exists to assure that option #4 is done with scientifically sound rationale?
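The comparison rules under "How to Evaluate Process Capability" can be turned into a small screening sketch. The 0.8 ratio used to decide what counts as a "disparity" is an arbitrary assumption made for illustration and does not come from the course material; the function name is hypothetical, and all four indices are assumed to be positive.

```python
def diagnose(cp: float, cpk: float, pp: float, ppk: float,
             ratio_tol: float = 0.8) -> list[str]:
    """Flag likely issues by comparing capability indices (assumes positive indices)."""
    findings = []
    # Stability: overall indices much worse than within-subgroup indices.
    if pp / cp < ratio_tol or ppk / cpk < ratio_tol:
        findings.append("stability")
    # Centering: indices that include the mean much worse than those that ignore it.
    if cpk / cp < ratio_tol or ppk / pp < ratio_tol:
        findings.append("centering")
    # Variation: inherent (within) spread too wide for the spec width.
    if cp < 1.0:
        findings.append("variation")
    return findings


# Index values taken from the four example processes A-D above.
examples = {
    "A": (2.24, 2.20, 2.29, 2.25),
    "B": (2.45, 0.78, 2.33, 0.74),
    "C": (0.70, 0.68, 0.71, 0.69),
    "D": (2.71, 2.68, 0.88, 0.87),
}
for name, indices in examples.items():
    print(name, diagnose(*indices) or ["no issue flagged"])
```

With these inputs the sketch flags B for centering, C for variation and D for stability, matching the slide titles. It is a screening aid only; control charts remain the primary evidence for stability.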


Quality Improvement Process


Attribute Data: WHAT IS Z

What is Z? Tells how capable Y is relative to specs

Z      DPMO
6      3.4
5      233
4      6,210
3      66,807
2      308,537
1      691,462

• DPMO = defects per million opportunities
• Opportunities = Number of Units * Opportunities per Unit (to have a defect)
• Defects = number of observed defects in the Number of Units

Attribute Capability Measures
1) A defect rate or a defective rate (they are the same if there is only one opportunity per unit for a defect; in this case a defective unit has only one defect)
2) DPMO
3) Z

Attribute Data: ROADMAP FOR CAPABILITY

Capability Roadmap: What Type of Data Do You Have?
• Attribute Data
• Variables Data: MINITAB: Stat > Quality Tools > Capability Analysis > Normal. Z.st: Z Bench Potential (Within)

For Attribute Data: Can use Minitab: Stat > Quality Tools > Capability Analysis > Binomial (for defective units or one defect opportunity per unit)
Warning: you have to manually add 1.5 to Z from Minitab to get Z.st from the Six Sigma Project Guide.

Attribute Data: ATTRIBUTE PROCESS CAPABILITY
• Opps per unit is the number of opportunities per unit to have this particular defect. A unit may have more than one opportunity to have a specific defect. (It is conservative to assume only 1 opportunity of a defect per unit.)
• The Six Sigma Project Guide is used to carry out the capability calculations. The icon for this Guide looks like this:
• Open up this Project Guide and click on the initial capability icon
• Note: Z.ST stands for Z short term, which is a common measure to use in Six Sigma.

Attribute Data
Example: Capability for Attribute Data
Project Goal: Improve Freestyle first pass yield from 10% to 20%.
50 Freestyles inspected, 44 defective. What is initial capability?

                     Defects   Opps per Unit   Units   Z.ST   Z.ST 95% Upper   Z.ST 95% Lower
Initial Capability   44        1               50      0.33   0.80             -0.19
Final Capability     NA        NA              NA

Initial Capability: Based on n=50, we are 95% confident: Z.ST < 0.80, Z.ST > -0.19


Attribute Data
Example: Capability for Attribute Data
Graph of Initial vs Final Capability (good way to present capability!)

Capability for Attribute Y
Project Goal (%Defective): 80.000%

                     Defects   Opps per Unit   Units   Total Opps   DPMO     Z.ST   Z.ST 95% Upper   Z.ST 95% Lower   Project Goal Z.ST
Initial Capability   44        1               50      50           880000   0.33   0.80             -0.19            0.658
Final Capability     80        1               100     100          800000   0.66   0.95             0.36             0.658

Final Capability:
# Units is arbitrary (since we don’t have any final data yet)
# Defects = # Units * Project Goal % = 100 * 80% = 80 (assumes goal is met)
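The DPMO and Z.ST point estimates in the table can be reproduced with a short sketch. It assumes the convention described in this chapter: the long-term Z is the standard Normal quantile for the fraction conforming, and Z.ST adds the 1.5 shift. The confidence bounds are not reproduced here, and the function name is hypothetical.

```python
from statistics import NormalDist


def attribute_capability(defects: int, units: int, opps_per_unit: int = 1,
                         shift: float = 1.5) -> tuple[float, float, float]:
    """Return (DPMO, long-term Z, Z.ST) from attribute counts.

    Requires 0 < defects < total opportunities, because the Normal quantile
    is undefined at 0% and 100% defective.
    """
    total_opps = units * opps_per_unit
    p_defect = defects / total_opps
    dpmo = 1e6 * p_defect
    z_long_term = NormalDist().inv_cdf(1.0 - p_defect)
    return dpmo, z_long_term, z_long_term + shift


initial = attribute_capability(defects=44, units=50)    # baseline data
goal = attribute_capability(defects=80, units=100)      # assumes the goal is met

print(f"Initial: DPMO = {initial[0]:.0f}, Z = {initial[1]:.3f}, Z.ST = {initial[2]:.2f}")
print(f"Goal:    DPMO = {goal[0]:.0f}, Z = {goal[1]:.3f}, Z.ST = {goal[2]:.3f}")
```

For the baseline of 44 defectives in 50 units, the long-term Z is about -1.175, which becomes roughly 0.33 after the 1.5 shift is added.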


Attribute Data
Attribute capability can be expressed as:
– a proportion defective with a confidence interval
– a Z with a confidence interval

Example: Capability for Attribute Data (cont)
[Figure: Initial vs Final Capability, %Defective with 95% Confidence Bounds; Initial Capability and Final Capability plotted against the Project Goal]
% Defective is a good way to explain capability
[Figure: Initial vs Final Capability, Z.ST with 95% Confidence Bounds; Initial Capability and Final Capability plotted against the Project Goal]
But in Lean Sigma they like Z

Attribute Data
For baseline capability: 44/50 defective units (one opportunity per unit)
Inputs: Minitab: Stat > Quality Tools > Capability Analysis > Binomial

Attribute Data
For baseline capability: 44/50 defective units (one opportunity per unit)
Outputs:
[Figure: Binomial Process Capability Analysis of Defectives.
 P Chart: UCL = 1, P-bar = 0.88, LCL = 0.7421.
 Binomial Plot: Expected Defectives vs. Observed Defectives.
 Cumulative %Defective vs. Sample.
 Histogram: Frequency vs. %Defective.]

Summary Stats (95.0% confidence)
% Defective: 88.00     Lower CI: 75.69      Upper CI: 95.47      Target: 0.00
PPM Def: 880000        Lower CI: 756899     Upper CI: 954665
Process Z: -1.1750     Lower CI: -1.6919    Upper CI: -0.6964

Note: Add 1.5 to Minitab Z outputs to get the Z.st & CI Z.st for baseline 44/50.

Attribute Data
Exercise: Capability for Attribute Data
Problem Statement: Expense reporting first pass yield is too low.
Project Goal: Improve first pass yield from 70% to 85%.

Submitted Reports: 200
Defects: 61
Opportunities per Report: 1

Task: Determine submitted reports capability
Approach: Work alone or in small groups.

Attribute Data: Manufacturing Yield

First-Pass Yield (%) by Operational Step
FPY = (Good / Attempts)*100
[Figure: flow for operation OP 10 showing Attempts, Good, Rework, Scrap and PRB]

Rolled-Throughput Yield (%) by Product
$\mathrm{RTY} = \prod_i \mathrm{FPY}_i = \mathrm{FPY}_1 \times \mathrm{FPY}_2 \times \mathrm{FPY}_3 \times \dots$
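The two yield formulas can be sketched as follows. The step counts in the example are invented purely for illustration and are not course data; the function names are hypothetical.

```python
from math import prod


def first_pass_yield(good: int, attempts: int) -> float:
    """FPY (%) for one operational step: (Good / Attempts) * 100."""
    return 100.0 * good / attempts


def rolled_throughput_yield(fpy_percents: list[float]) -> float:
    """RTY (%) for a product: product of the step FPYs, each expressed as a fraction."""
    return 100.0 * prod(fpy / 100.0 for fpy in fpy_percents)


# Hypothetical three-step route: (good, attempts) per operation.
steps = [(95, 100), (90, 95), (88, 90)]
fpys = [first_pass_yield(good, attempts) for good, attempts in steps]

print("FPY by step:", [round(f, 1) for f in fpys])
print(f"RTY: {rolled_throughput_yield(fpys):.1f}%")
```

Because RTY multiplies the step yields, it can never exceed the lowest step FPY, and it falls as more steps are added.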

Individuals Control Chart
[Figure: Individuals control chart of Daily FPY (%) by Day (days 1 to 55), with center line X-bar = 96.55 and LCL = 90.70.]

Purpose: (1) from baseline data, determine threshold limits for prospective monitoring

NOTE: A statistically stable process is in control, displaying a consistent pattern of variation over time. The variation exhibited by a stable process is considered to be due to chance or common causes that are inherent to the design of the system (product and process). Therefore, a stable process is operating to its full potential by design. If we desire better performance (increase mean FPY, or reduce variation), then a change to the system is required. What type of changes may be effective? Who is responsible for executing changes to the system?

Individuals Control Chart
Purpose: (2) quantify process stability by comparing two estimates of variation:
Long-Term: Sample Standard Deviation
Short-Term: Average Moving Range / 1.128
Stability Index = Long-Term / Short-Term
Process is Unstable When Stability Index >> 1.0
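The stability index defined above can be sketched as follows. The daily FPY values are hypothetical; 1.128 is the d2 constant for moving ranges of two consecutive points, as used on the slide.

```python
from statistics import mean, stdev

D2_MOVING_RANGE_OF_2 = 1.128  # control-chart constant d2 for subgroups of size 2


def stability_index(values):
    """Long-term sigma (sample standard deviation) over short-term sigma (MR-bar / d2)."""
    long_term = stdev(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    short_term = mean(moving_ranges) / D2_MOVING_RANGE_OF_2
    return long_term / short_term


# Hypothetical daily FPY series: the second one shifts down halfway through
steady = [96, 97, 96, 97, 96, 97, 96, 97, 96, 97, 96, 97]
shifted = [96, 97, 96, 97, 96, 97, 90, 91, 90, 91, 90, 91]

print(f"steady series:  {stability_index(steady):.2f}")
print(f"shifted series: {stability_index(shifted):.2f}")
```

A sustained shift inflates the long-term estimate far more than the moving ranges, which is why the index rises well above 1.0 for the second series.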


Example A
[Figure: I-MR chart of Daily FPY (%) by Day (days 1 to 55). Individuals chart: X-bar = 96.51, LCL = 90.58. Moving range chart: UCL = 7.288, MR-bar = 2.231, LCL = 0.]

I Chart (Long-Term): S = 2.013
MR Chart (Short-Term): S = 2.231 / 1.128 = 1.98
Stability Index = 2.013 / 1.978 = 1.02

Example B
[Figure: I chart of Daily FPY (%) by Day (days 1 to 55), with X-bar = 97.0, LCL = 82.6, and a bound labeled UB = 0; out-of-control points are flagged.]

NOTE: Variation that exceeds statistical control limits should be treated as due to the presence of a special cause; local action should be taken to investigate, determine root cause, and prevent recurrences.


Example C
[Figure: I chart of Daily FPY (%) by Day (days 1 to 55), with X-bar = 99.32 and LCL = 97.40; out-of-control points are flagged.]


Macro View of FPY by Op
[Figure: scatter plot of AVERAGE FPY (96 to 100) vs. STABILITY INDEX (1 to 6) for operations A, B, and C.]

Improvement Strategy
[Figure: the same AVERAGE FPY vs. STABILITY INDEX plot for operations A, B, and C, annotated with two improvement regions.]
• Capable But Periodically Unstable: Identify & Remove Special Causes. Daily: MTM/Supervisors
• Stable but Chronically Less Capable: Change System (Projects). Monthly, Quarterly: Ops Mgmt, Engr

Non-normal Data
Dataset: DISTSKEW.MTW
Variables: Pos Skew (column B)
Objective: Determine Cpk with Specs: 5-50
Pathway: Stat/Basic Statistics/Graphical Summary
Inputs: select variable Pos Skew to analyze
Is this data normally distributed?
Pathway: Graph/Probability Plot (Test for Normality, default option)
Inputs: select variable Pos Skew to analyze
Two-plot layout: Right-click the folder icon on the toolbar, to the left of the “i” toolbar symbol. Hold down the Control key and left-click the two graph names, then right-click the graph names to get the layout tool and click Finish.
Layout tool results:
[Figure: merged layout of the graphical summary and the probability plot.]

Can we compute Cpk?

Non-normal Data
CPK FOR NON-NORMAL DISTRIBUTION
Dataset: DISTSKEW.MTW
Variables: Pos Skew (column B)
Box-Cox Transformation:
Pathway: Stat/Control Charts/Box-Cox
Inputs: all obs in one column / select variable “Pos Skew” / Subgroup Size 1
Johnson Transformation:
Pathway: Stat/Quality Tools/Johnson Transformation
Inputs: select variable Pos Skew to analyze
Merged layout:
[Figure: merged Box-Cox and Johnson transformation plots; Box-Cox λ = 0.0.]

Non-normal Data
BOX-COX TRANSFORMATION
Table of Transformations

λ        Transformation
 1       No transformation
 1/2     Square root
 0       Log
-1/2     Reciprocal square root
-1       Reciprocal

Example of Minitab Box-Cox Input Screen with Lambda=0
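The table of transformations can be expressed as a single function. This is a minimal sketch of the convention shown in the table (the data raised to the power λ, with the natural log at λ = 0), which differs from the textbook form (x^λ - 1)/λ only by a shift and scale. The function name and sample values are hypothetical.

```python
import math


def box_cox(x, lam):
    """Power transformation as tabulated: x**lam, or ln(x) when lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    return math.log(x) if lam == 0 else x ** lam


# Hypothetical positively skewed observations
values = [1.0, 4.0, 9.0, 25.0]
for lam in (1, 0.5, 0, -0.5, -1):
    transformed = [round(box_cox(v, lam), 4) for v in values]
    print(f"lambda = {lam:>4}: {transformed}")
```

Spec limits must be transformed with the same λ before computing capability on the transformed scale.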


Non-normal Data
CPK WITH TRANSFORMED DATA
What is the Cpk for DistSkew Data Set?
Pathway: Stat/Quality Tools/Capability Analysis/Normal
Inputs: select variable “Pos Skew”, subgroup size 1, LSL=5, USL=50, AND click on the Box-Cox button and select “Use optimal lambda”

Recall λ = 0 is the log transformation of the data.
Cpk = __________.

Non-normal Data
CAPABILITY WITH TRANSFORMED DATA
Capability Sixpack
Pathway: Stat/Quality Tools/Capability Sixpack/Normal
Inputs: select variable “Pos Skew”, subgroup size 1, LSL=5, USL=50, AND click on the Box-Cox button and select “Use optimal lambda”

Non-normal Data
CAPABILITY WITH RAW DATA
What is the Capability for DistSkew Data Set?
Pathway: Stat/Quality Tools/Capability Analysis/Nonnormal
Inputs: select variable “Pos Skew”, subgroup size 1, LSL=5, USL=50, AND select the Distribution radio button with Lognormal chosen from the pull-down
Output:


Non-normal Data
Capability Normal Branch with Box-Cox vs Log-Normal

1) The ppm “Observed” stays the same when you fit the log-normal using either the normal or non-normal capability branch. In fact, the ppm observed stays the same no matter what distribution you fit to the data.
2) The ppm “Expected Overall” stays the same when you fit the log-normal using either the normal or non-normal capability branch.
3) The Ppks can be very different between the “capability normal” branch (with the Box-Cox transform) and the “capability nonnormal” branch (with a lognormal fit), because the “capability nonnormal” branch uses the ISO definition of Ppk.
4) The “capability nonnormal” branch has no Cpk, just Ppk. It has no confidence interval for Ppk either.


Minitab Assistant vs Method Chooser
Minitab Method Chooser Flowchart

Minitab Assistant Flow Chart

Minitab Assistant: Continuous Data
• Minitab Assistant wants 100 or more data points.
• Minitab tests (AD) for normality at the .05 level.
• Minitab Assistant uses THREE rules to check for stability of the process:
– Test 1: Point outside control limits
– Test 2: Nine points in a row on the same side of the centerline
– Test 7 (Modified): 12-15 points within one sigma of the centerline
• Minitab Assistant info: http://www.minitab.com/enCN/support/answers/answer.aspx?ID=2613&langType=1033



Minitab Assistant


Minitab Assistant – Normal Distribution • Use CABLE.MTW • Dataset has 100 measurements of the diameter of a cable wire – 20 hourly samples of n=5 • The engineering specification for this diameter is 0.55 +/- 0.05 cm. • Our task is to conduct a capability analysis of this process.



Minitab Assistant – Normal Distribution


Minitab Assistant – Normal Distribution
Capability Analysis for Diameter: Report Card

Check                 Description
Stability             The process mean and variation are stable. No points are out of control.
Number of Subgroups   You only have 20 subgroups. For a capability analysis, it is generally recommended that you collect at least 25 subgroups over a long enough period of time to capture the different sources of process variation.
Normality             Your data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data        The total number of observations is 100 or more. The capability estimates should be reasonably precise.

Capability Analysis for Diameter: Process Performance Report
[Figure: capability histogram of Diameter (0.50 to 0.60) with LSL and USL marked. Are the data inside the limits?]

Process Characterization
Total N                       100
Subgroup size                 5
Mean                          0.54646
StDev (overall)               0.019341
StDev (within)                0.018548

Capability Statistics
Actual (overall)
Pp                            0.86
Ppk                           0.80
Z.Bench                       2.29
% Out of spec (observed)      2.00
% Out of spec (expected)      1.10
PPM (DPMO) (observed)         20000
PPM (DPMO) (expected)         10969
Potential (within)
Cp                            0.90
Cpk                           0.83
Z.Bench                       2.41
% Out of spec (expected)      0.81
PPM (DPMO) (expected)         8072

Actual (overall) capability is what the customer experiences.
Potential (within) capability is what could be achieved if process shifts and drifts were eliminated.

Capability Analysis for Diameter: Diagnostic Report
[Figure: Xbar-R chart for subgroups 1 to 20 (confirm that the process is stable) and normality plot (the points should be close to the line).]
Normality Test (Anderson-Darling): Results Pass, P-value 0.794

Capability Analysis for Diameter: Summary Report
[Figure: “How capable is the process?” gauge from Low (0) to High (6) showing Z.Bench = 2.29, and capability histogram with LSL and USL marked.]
Customer Requirements: Upper Spec 0.6, Target *, Lower Spec 0.5
Process Characterization: Mean 0.54646, Standard deviation 0.019341
Actual (overall) capability: Pp 0.86, Ppk 0.80, Z.Bench 2.29, % Out of spec 1.10, PPM (DPMO) 10969
Conclusions: The defect rate is 1.10%, which estimates the percentage of parts from the process that are outside the spec limits.
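The overall capability statistics for the cable diameter data can be reproduced from the summary values alone (mean 0.54646, overall standard deviation 0.019341, specs 0.50 to 0.60). The sketch below assumes the usual normal-theory definitions of Pp, Ppk, expected PPM, and Z.Bench; the function name is hypothetical.

```python
from statistics import NormalDist


def overall_capability(mean, sd, lsl, usl):
    """Normal-theory capability summary from a mean, standard deviation and specs."""
    pp = (usl - lsl) / (6 * sd)
    ppl = (mean - lsl) / (3 * sd)
    ppu = (usl - mean) / (3 * sd)
    process = NormalDist(mean, sd)
    ppm_below = process.cdf(lsl) * 1e6
    ppm_above = (1 - process.cdf(usl)) * 1e6
    ppm_total = ppm_below + ppm_above
    # Z.Bench: the Z value with the same total tail area as both tails combined
    z_bench = NormalDist().inv_cdf(1 - ppm_total / 1e6)
    return {
        "Pp": pp, "PPL": ppl, "PPU": ppu, "Ppk": min(ppl, ppu),
        "PPM < LSL": ppm_below, "PPM > USL": ppm_above,
        "PPM Total": ppm_total, "Z.Bench": z_bench,
    }


result = overall_capability(mean=0.54646, sd=0.019341, lsl=0.50, usl=0.60)
for name, value in result.items():
    print(f"{name:10s} {value:10.2f}")
```

Substituting the within-subgroup standard deviation (0.018548) for the overall one gives the potential (within) indices Cp and Cpk in the same way.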



Minitab Assistant: Non-Normal Data

• Use TILES.MTW • Choose Minitab Assistant – Capability Analysis • Detects non-normality and offers the option of transformation (Box-Cox)


Minitab Assistant: Non-Normal Data
Capability Analysis for Warping: Report Card

Check                 Description
Stability             The process mean and variation are stable. No points are out of control.
Number of Subgroups   You only have 10 subgroups. For a capability analysis, it is generally recommended that you collect at least 25 subgroups over a long enough period of time to capture the different sources of process variation.
Normality             The transformed data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data        The total number of observations is 100 or more. The capability estimates should be reasonably precise.

Capability Analysis for Warping: Diagnostic Report
[Figure: Xbar-S chart for subgroups 1 to 10 (confirm that the process is stable) and normality plot with lambda = 0.50 (the points should be close to the line).]

Normality Test (Anderson-Darling)
             Original    Transformed
Results      Fail        Pass
P-value      0.010       0.574

Capability Analysis for Warping: Summary Report
[Figure: “How capable is the process?” gauge from Low (0) to High (6) showing Z.Bench = 2.24, and capability histogram with USL marked. Are the data below the limit?]
Customer Requirements: Upper Spec 8, Target *, Lower Spec *
Process Characterization: Mean 2.9231, Standard deviation 1.7860
Actual (overall) capability: Pp *, Ppk 0.75, Z.Bench 2.24, % Out of spec 1.26, PPM (DPMO) 12569
Conclusions: The defect rate is 1.26%, which estimates the percentage of parts from the process that are outside the spec limits.

Capability Analysis for Warping: Process Performance Report
[Figure: capability histograms of the original and transformed data (0.0 to 7.5) with USL marked. Are the data below the limit?]

Process Characterization
Total N                       100
Subgroup size                 10

Capability Statistics
Actual (overall)
Pp                            *
Ppk                           0.75
Z.Bench                       2.24
% Out of spec (observed)      2.00
% Out of spec (expected)      1.26
PPM (DPMO) (observed)         20000
PPM (DPMO) (expected)         12569
Potential (within)
Cp                            *
Cpk                           0.76
Z.Bench                       2.28
% Out of spec (expected)      1.12
PPM (DPMO) (expected)         11249

Actual (overall) capability is what the customer experiences.
Potential (within) capability is what could be achieved if process shifts and drifts were eliminated.


Confidence Limits: NOT IN ASSISTANT
• Stat -> Quality Tools -> Capability Analysis -> Normal
– Follow this path if you need to calculate the Lower Confidence Bound on Cpk or Ppk
Note: The Normal branch has Box-Cox transformations (for non-normal data) that allow you to get Cpk and Ppk, and confidence intervals for Cpk and Ppk, on the transformed scale.
Note: There is no Cpk and no confidence interval for Ppk if you use the Non-Normal branch.
Note: The Minitab Assistant DOES NOT give confidence limits for capability indices. It does allow you to use the Box-Cox transform when it detects non-normal data.

Confidence Limits: Normal Case
[Figure: Minitab capability analysis options dialog. Select one-sided lower limit.]

Confidence Limits: Normal Case
Process Capability of Diameter (using 95.0% confidence)
[Figure: histogram of Diameter (0.50 to 0.60) with LSL and USL marked and Within and Overall normal curves.]

Process Data
LSL                 0.5
Target              *
USL                 0.6
Sample Mean         0.54646
Sample N            100
StDev(Within)       0.0185477
StDev(Overall)      0.0193414

Potential (Within) Capability (Cpk and 95% lower confidence limit for Cpk)
Cp                  0.90
Lower CL            0.78
CPL                 0.83
CPU                 0.96
Cpk                 0.83
Lower CL            0.71

Overall Capability (Ppk and 95% lower confidence limit for Ppk)
Pp                  0.86
Lower CL            0.76
PPL                 0.80
PPU                 0.92
Ppk                 0.80
Lower CL            0.69
Cpm                 *
Lower CL            *

Performance       Observed     Exp. Within    Exp. Overall
PPM < LSL         10000.00     6124.50        8150.57
PPM > USL         10000.00     1947.11        2818.71
PPM Total         20000.00     8071.61        10969.28

Confidence Limits: Normal Case
LOWER 95% CONFIDENCE FOR OBSERVED CPK

           Sample Size (n)
Obs Cpk    10     20     30     40     50     75     100    150    200
0.5        0.24   0.32   0.35   0.37   0.39   0.41   0.42   0.43   0.44
0.6        0.31   0.40   0.44   0.46   0.47   0.50   0.51   0.53   0.54
0.7        0.38   0.48   0.52   0.54   0.56   0.59   0.60   0.62   0.63
0.8        0.44   0.55   0.60   0.63   0.65   0.67   0.69   0.71   0.72
0.9        0.51   0.63   0.68   0.71   0.73   0.76   0.78   0.80   0.82
1.0        0.58   0.71   0.76   0.79   0.82   0.85   0.87   0.89   0.91
1.1        0.64   0.78   0.84   0.88   0.90   0.94   0.96   0.99   1.00
1.2        0.70   0.86   0.92   0.96   0.99   1.03   1.05   1.08   1.09
1.3        0.77   0.93   1.00   1.04   1.07   1.11   1.14   1.17   1.19
1.4        0.83   1.01   1.08   1.13   1.15   1.20   1.23   1.26   1.28
1.5        0.89   1.08   1.16   1.21   1.24   1.29   1.32   1.35   1.37
1.6        0.96   1.16   1.24   1.29   1.32   1.37   1.41   1.44   1.46
1.7        1.02   1.23   1.32   1.37   1.41   1.46   1.49   1.53   1.55
1.8        1.08   1.30   1.40   1.45   1.49   1.55   1.58   1.62   1.65
1.9        1.14   1.38   1.48   1.54   1.57   1.64   1.67   1.71   1.74
2.0        1.21   1.45   1.56   1.62   1.66   1.72   1.76   1.80   1.83
2.1        1.27   1.53   1.64   1.70   1.74   1.81   1.85   1.89   1.92
2.2        1.33   1.60   1.71   1.78   1.83   1.90   1.94   1.99   2.01
2.3        1.39   1.67   1.79   1.86   1.91   1.98   2.03   2.08   2.11
2.4        1.45   1.75   1.87   1.94   1.99   2.07   2.11   2.17   2.20
2.5        1.52   1.82   1.95   2.03   2.08   2.16   2.20   2.26   2.29
2.6        1.58   1.90   2.03   2.11   2.16   2.24   2.29   2.35   2.38
2.7        1.64   1.97   2.11   2.19   2.24   2.33   2.38   2.44   2.47
2.8        1.70   2.04   2.19   2.27   2.33   2.42   2.47   2.53   2.57
2.9        1.76   2.12   2.27   2.35   2.41   2.50   2.56   2.62   2.66
3.0        1.82   2.19   2.34   2.43   2.50   2.59   2.65   2.71   2.75
3.1        1.89   2.26   2.42   2.52   2.58   2.68   2.73   2.80   2.84
3.2        1.95   2.34   2.50   2.60   2.66   2.76   2.82   2.89   2.93
3.3        2.01   2.41   2.58   2.68   2.75   2.85   2.91   2.98   3.03
3.4        2.07   2.48   2.66   2.76   2.83   2.94   3.00   3.07   3.12
3.5        2.13   2.56   2.74   2.84   2.91   3.02   3.09   3.16   3.21
3.6        2.19   2.63   2.82   2.92   3.00   3.11   3.18   3.25   3.30
3.7        2.25   2.71   2.89   3.01   3.08   3.20   3.26   3.34   3.39
3.8        2.32   2.78   2.97   3.09   3.16   3.28   3.35   3.44   3.48
3.9        2.38   2.85   3.05   3.17   3.25   3.37   3.44   3.53   3.58
4.0        2.44   2.93   3.13   3.25   3.33   3.46   3.53   3.62   3.67
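The tabulated lower bounds appear to follow a common normal approximation for the standard error of an observed Cpk. The source does not state the formula, so the expression below is an assumption: lower bound = Cpk - z * sqrt(1/(9n) + Cpk^2/(2(n-1))), with z the one-sided normal quantile. It agrees with the spot-checked entries of the table.

```python
from math import sqrt
from statistics import NormalDist


def cpk_lower_bound(cpk, n, conf=0.95):
    """Approximate one-sided lower confidence bound for an observed Cpk.

    Assumed formula (normal approximation): the table's source formula is not stated.
    """
    z = NormalDist().inv_cdf(conf)
    return cpk - z * sqrt(1 / (9 * n) + cpk**2 / (2 * (n - 1)))


for observed, n in [(0.5, 10), (1.0, 10), (2.0, 50), (4.0, 200)]:
    print(f"Obs Cpk = {observed}, n = {n}: lower bound = {cpk_lower_bound(observed, n):.2f}")
```

The bound widens sharply for small samples: an observed Cpk of 1.0 from 10 parts only supports a claim of roughly 0.58 at 95% confidence.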


EXERCISE: Confidence Bound for Cpk
Simulation Study of Cpk:
1) Simulate 10,000 rows with 5 columns of a normal distribution with mean = 10 and std. dev. = 1.
2) Compute the mean and standard deviation for each row.
3) Use LSL = 7, USL = 13, and compute Cpl & Cpu for each row.
4) Take the min of Cpl & Cpu to get Cpk for each row.
5) Make a histogram of the simulated Cpks.
Does the distribution of simulated Cpks look normal? What should the theoretical Cpk be from the mean, standard deviation and specs? What is the distribution of Cpk lower bounds? How often does the lower confidence bound contain the “true” value for Cpk?
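The exercise is meant to be run in Minitab, but the same study can be sketched in Python. The lower-bound formula used here is an assumed normal approximation (the exercise does not specify one), and the seed is arbitrary, so individual results will differ slightly from any Minitab run.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, median, stdev

MU, SIGMA, LSL, USL = 10.0, 1.0, 7.0, 13.0
TRUE_CPK = min((USL - MU) / (3 * SIGMA), (MU - LSL) / (3 * SIGMA))  # 1.0 in theory


def simulate_cpk(n=5, rows=10_000, seed=1):
    """Sample Cpk for each of `rows` simulated samples of size n."""
    rng = random.Random(seed)
    cpks = []
    for _ in range(rows):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        cpl = (xbar - LSL) / (3 * s)
        cpu = (USL - xbar) / (3 * s)
        cpks.append(min(cpl, cpu))
    return cpks


def lower_bound(cpk, n, conf=0.95):
    """Assumed normal-approximation lower confidence bound for Cpk."""
    z = NormalDist().inv_cdf(conf)
    return cpk - z * sqrt(1 / (9 * n) + cpk**2 / (2 * (n - 1)))


n = 5
cpks = simulate_cpk(n=n)
# A "miss" is a lower bound that lands above the true Cpk
miss_rate = sum(lower_bound(c, n) > TRUE_CPK for c in cpks) / len(cpks)

print(f"true Cpk = {TRUE_CPK:.3f}")
print(f"mean = {mean(cpks):.3f}, median = {median(cpks):.3f}, stdev = {stdev(cpks):.3f}")
print(f"lower-bound miss rate = {100 * miss_rate:.2f}%")
```

With n = 5 the sample Cpk is strongly right-skewed, so the mean sits above the true value while the median sits slightly below it.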



Simulation Results for n=5
Summary for Sample Cpk Estimated from n=5
Mu=10, Sigma=1, LSL=7, USL=13, Population Cpk = 1.0
[Figure: Minitab graphical summary: right-skewed histogram of sample Cpk (about 0 to 7.7) with 95% confidence intervals for the mean and median.]

Anderson-Darling Normality Test
A-Squared        436.22
P-Value <        0.005

Mean             1.1036
StDev            0.5573
Variance         0.3106
Skewness         2.7486
Kurtosis         13.4533
N                10000

Minimum          0.3191
1st Quartile     0.7542
Median           0.9676
3rd Quartile     1.2816
Maximum          7.8123

95% Confidence Interval for Mean:      1.0926 to 1.1145
95% Confidence Interval for Median:    0.9589 to 0.9761
95% Confidence Interval for StDev:     0.5497 to 0.5651

In theory, Cpk = min( (13 – 10)/(3*1), (10 – 7)/(3*1) ) = 1.000

Simulation Results for n=5, 10, 20, 30, 50, 100
[Figure: six histograms of simulated sample Cpk with fitted normal curves, one per sample size.]

Sample size    Mean      StDev      N
n=5            1.104     0.5573     10000
n=10           1.007     0.2853     10000
n=20           0.9802    0.1763     10000
n=30           0.9770    0.1411     10000
n=50           0.9788    0.1053     10000
n=100          0.9825    0.07376    10000

Simulation Results for n=5, 10, 20, 30, 50, 100
[Figure: boxplot of sample Cpk estimates by sample size (Cpk 5, 10, 15, 20, 25, 30, 50, 100); vertical axis 0 to 8.]
Assumptions: Normal, μ = 10, σ = 1, LSL = 7, USL = 13, True Cpk = 1.0

Simulation Results for n=5, 10, 20, 30, 50, 100
[Figure: boxplot of Cpk lower bounds by sample size (Cpk LB 5, 10, 15, 20, 25, 30, 50, 100); vertical axis 0.0 to 3.5.]
Assumptions: Normal, μ = 10, σ = 1, LSL = 7, USL = 13, True Cpk = 1.0

Performance of Cpk Lower Confidence Bound
[Figure: 95% Lower Confidence Bound for Cpk: Miss Rate (0% to 6%) vs. True Population Mean (7.0 to 10.0), one line per sample size N = 5, 10, 15, 20, 25, 30, 50, 100, with the nominal 5.00% miss rate marked. LSL = 7, target = 10, USL = 13. Population sigma varies to make Cpk = 1.0.]

Confidence Limits: Normal Case
LOWER 95% CONFIDENCE FOR OBSERVED CPK
Simulation Results
• The formula is conservative when the process mean is on target (better than 95% coverage of the true Cpk value).
• As the process mean deviates from target, the formula provides approximately the stated reliability in performance (95%), regardless of sample size.

Relationship Between Cpk & Tolerance Intervals (Confidence/Reliability Levels)

Attribute Sample Sizes Using c=0 Plans
NUMBER OF TESTS WITHOUT FAILURE VS RELIABILITY AND CONFIDENCE

               Confidence Level (%)
Reliability    50       60       70       75       80       85       90       95       97.5     99       99.5     99.9
0.999999       693147   916291   1203973  1386294  1609438  1897120  2302584  2995731  3688878  4605168  5298315  6907752
0.99999        69315    91629    120397   138629   160943   189712   230258   299572   368887   460515   529830   690773
0.9999         6932     9163     12040    13863    16094    18971    23025    29956    36887    46050    52981    69075
0.999          693      916      1204     1386     1609     1897     2302     2995     3688     4603     5296     6905
0.998          347      458      602      693      804      948      1151     1497     1843     2301     2647     3451
0.997          231      305      401      462      536      632      767      998      1228     1533     1764     2300
0.996          173      229      301      346      402      474      575      748      921      1149     1322     1724
0.995          139      183      241      277      322      379      460      598      736      919      1058     1379
0.994          116      153      201      231      268      316      383      498      613      766      881      1148
0.993          99       131      172      198      230      271      328      427      526      656      755      984
0.992          87       115      150      173      201      237      287      373      460      574      660      861
0.991          77       102      134      154      179      210      255      332      409      510      587      765
0.99           69       92       120      138      161      189      230      299      368      459      528      688
0.98           35       46       60       69       80       94       114      149      183      228      263      342
0.97           23       31       40       46       53       63       76       99       122      152      174      227
0.96           17       23       30       34       40       47       57       74       91       113      130      170
0.95           14       18       24       28       32       37       45       59       72       90       104      135
0.94           12       15       20       23       27       31       38       49       60       75       86       112
0.93           10       13       17       20       23       27       32       42       51       64       74       96
0.92           9        11       15       17       20       23       28       36       45       56       64       83
0.91           8        10       13       15       18       21       25       32       40       49       57       74
0.9            7        9        12       14       16       19       22       29       36       44       51       66
0.8            4        5        6        7        8        9        11       14       17       21       24       31
0.7            2        3        4        4        5        6        7        9        11       13       15       20
0.6            2        2        3        3        4        4        5        6        8        10       11       14
0.5            1        2        2        3        3        3        4        5        6        7        8        10
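The entries of the c=0 table are consistent with the success-run relationship R^n <= 1 - C, that is, n = ceil(ln(1 - C) / ln(R)). The sketch below assumes that relationship; the function name is hypothetical.

```python
from math import ceil, log


def c0_sample_size(reliability, confidence):
    """Smallest n such that n passes with zero failures demonstrate
    `reliability` at `confidence` (both given as fractions)."""
    return ceil(log(1 - confidence) / log(reliability))


for reliability, confidence in [(0.90, 0.90), (0.95, 0.95), (0.99, 0.95), (0.999, 0.90)]:
    n = c0_sample_size(reliability, confidence)
    print(f"{100 * confidence:.0f}% confidence / {100 * reliability:.1f}% reliability: n = {n}")
```

When the ratio of logarithms is an exact integer (for example reliability 0.5 at 75% confidence), floating-point rounding can move the result by one, so such edge cases deserve a manual check.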


Producer’s Risk Of Using c=0 Plans

               Sample Size Required        Reliability Level Required to Pass
               (0 Failures Allowed)        50% Chance                95% Chance
Reliability    Conf. 90    Conf. 95        Conf. 90    Conf. 95      Conf. 90    Conf. 95
0.999          2302        2995            0.99970     0.99977       0.99998     0.99998
0.997          767         998             0.99910     0.99931       0.99993     0.99995
0.99           230         299             0.9970      0.9977        0.9997      0.9998
0.95           45          59              0.9847      0.9883        0.9989      0.9991
0.90           22          29              0.9690      0.9764        0.9977      0.9982
0.80           11          14              0.9389      0.9517        0.9950      0.9963

NOTE: The term reliability in a compliance testing context refers to conformance to design requirements, not to the actual device field performance level. The difference is due to unaccounted design margin between spec limits and the variation required to degrade field performance.

c=0 Plans Maximize The Chances Of A Good Process Failing The Study
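The "reliability level required to pass" columns follow from the chance that all n units pass, which is R^n for a process with true reliability R. A minimal sketch, with hypothetical function names:

```python
def chance_of_passing(true_reliability, n):
    """Probability that n independent units all pass (c=0 plan)."""
    return true_reliability ** n


def reliability_needed_to_pass(n, pass_probability):
    """True reliability at which a c=0 plan of size n passes with the given probability."""
    return pass_probability ** (1 / n)


# 90% confidence / 90% reliability plan: n = 22
n = 22
print(f"R needed for a 50% chance of passing: {reliability_needed_to_pass(n, 0.50):.4f}")
print(f"R needed for a 95% chance of passing: {reliability_needed_to_pass(n, 0.95):.4f}")
print(f"Chance of passing when R is exactly 0.90: {chance_of_passing(0.90, n):.3f}")
```

A process that only just meets the 90% requirement passes about 10% of the time, so the producer needs considerably better true reliability than the level being demonstrated.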

The Big Picture of Compliance Testing

BEFORE: Characterization Studies
• Inject sources of variation to stress system
• Experimentation (DOE)
• Simulation Modeling
• Measure design margin
• Optimization

DURING: Qualification Studies
• n delivers conf%/rel%
• Limited conditions
• Representative sample?
• One time point

AFTER: Process Stability
• All sources of variation will be acting over the long term in the future
• Need to detect significant changes

How To Move Away From A Compliance Testing Culture Toward A Capability Culture
Identify Critical Requirements
1. Perform thorough characterization studies; inject sources of variation, test to failure
2. Demand variables data, system performance modeling, measure design margin, robust design, optimization: then you can skip compliance testing!
3. If variables data is unavailable, challenge that!
4. For attribute data: select risk-based confidence/reliability levels, perform compliance testing or else cite the work done during characterization!
5. Control processes to ensure that our system robustness does not deteriorate over time and that we are alerted to assignable causes of variation if they occur

Impact of Measurement Error (Imprecision)

$$\sigma^2_{\text{observed}} = \sigma^2_{\text{product}} + \sigma^2_{\text{measurement error}}$$

Process capability study variation is inflated by measurement error (gage repeatability & reproducibility). Therefore, if an independent gage R&R study has been completed, then subtract the measurement error from the observed process capability variation to estimate true product variation:

$$\sigma^2_{\text{product}} = \sigma^2_{\text{observed}} - \sigma^2_{\text{measurement error}}$$
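Because variances add and standard deviations do not, the subtraction has to be done on the squared scale. The sketch below uses hypothetical standard deviations and specification limits to show the effect on a capability index.

```python
from math import sqrt


def product_sd(observed_sd, measurement_sd):
    """Estimate the true product standard deviation by subtracting variances."""
    difference = observed_sd**2 - measurement_sd**2
    if difference < 0:
        raise ValueError("measurement variance exceeds observed variance")
    return sqrt(difference)


# Hypothetical values: observed capability-study sigma and gage R&R sigma
observed, gage = 5.0, 3.0
true_sd = product_sd(observed, gage)

# Hypothetical two-sided spec width of 36 units
spec_width = 36.0
print(f"product sigma = {true_sd:.2f}")
print(f"Pp from observed sigma = {spec_width / (6 * observed):.2f}")
print(f"Pp from product sigma  = {spec_width / (6 * true_sd):.2f}")
```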


Capability Analysis: Summary
PROCESS CAPABILITY: THE NATURAL VARIABILITY IN A PROCESS.

VARIABLES DATA
• Cp, Pp: Measure of process potential (for a centered process)
• Cpk, Ppk: Measure of actual process capability
• PPM estimates from PROCESS CAPABILITY INDICES ASSUME THAT THE PROCESS IS STABLE AND FOLLOWS A NORMAL (BELL-SHAPED) DISTRIBUTION.
– Make sure there are no shifts or trends
– If the data are not normal, try other parametric distributions (Weibull or lognormal) or a Box-Cox transformation
– If those fail, consider the data as attribute
• Consider the impact of sample size and how the data was collected (short-term vs. long-term) when making inferences – use confidence bounds to incorporate uncertainty in estimates

ATTRIBUTE DATA
• The proportion (p bar) from the P chart is the process capability.

Summary Quiz
True or False
___________ You can ignore plotting the data and just compute Ppk.
___________ The “Total Exp. Overall” ppm are the same for log-normal data in Minitab for both of these approaches: 1) the Normal Capability branch (with lambda=0) and 2) the non-normal Capability branch and selecting Log-normal.
___________ The smaller the sample used to compute Ppk, the better. It is less work to collect the data.

Summary And Recap
Measuring Process Capability
• Sigma Scale, Z scores, DPM=PPM
• Process Capability Indices (Cp, Cpk, Pp, Ppk)
• Impact of Normality & Process Stability
Attribute Data
Non-normal Data
Minitab Assistant
Impact of Sample Size (Confidence Limits)
Comparison to Tolerance Intervals
Impact of Measurement Error

Chapter 4B: Tolerance Intervals

Topics • Tolerance Intervals – Calculations – Sample Size



Statistical Tolerance Intervals • From the Medtronic Handbook of Statistics: – For variables data, a statistical tolerance interval places limits on the variation expected in individual items from a population. – A tolerance interval is described by two parameters: confidence level and population fraction (sometimes called “reliability”, for fraction meeting spec(s) )


Tolerance Intervals – New in Minitab 16 • A new feature in Minitab 16 is the calculation of tolerance intervals using a normal-distribution assumption. – The normal distribution assumption is critical. Unlike confidence intervals which are somewhat unaffected by lack of normality, tolerance intervals are completely dependent upon it.



Normal distribution Tolerance Interval


Statistical Tolerance Intervals • From the Medtronic Handbook of Statistics, cont’d: – If the data is not normal, transformations should be tried to obtain normality. For example, if the data were lognormal then tolerance intervals could be constructed on the log of the data. – If the underlying population distribution is known but is not normal then reliability/distribution analysis techniques can be used. – Tolerance Intervals generally should be • Two-sided if the specification is two-sided • One-sided if the specification is one-sided



Tolerance Interval Calculation • First determine appropriate data distribution or transformation • For Normal distribution or transformation to normal distribution – Use Stat -> Quality Tools -> Tolerance Interval

• For other distribution (e.g. Weibull) – Use Stat -> Reliability/Survival -> Parametric Distribution Analysis


Example: Tolerance Intervals • Use Ch1DataFile.mtw • Variables TubeTensile1, TubeTensile2, TubeTensile3



Step 1: Identify Distribution • Stat -> Basic Statistics -> Normality Test


Step 2: Calculate tolerance bound Lower Tolerance Bound for TubeTensile1


Tolerance Interval Output – TubeTensile1


A Very Confusing Output in Minitab 16
Tolerance Interval Plot for Tensile Bond 1_1
95% Lower Bound
At Least 95% of Population Covered
[Figure: Minitab tolerance interval plot with a histogram (0 to 70), Normal and Nonparametric interval plots, and a normal probability plot.]

Statistics:        N = 30, Mean = 41.063, StDev = 10.127
Normal:            Lower = 18.583
Nonparametric:     Lower = 7.560
Normality Test:    AD = 0.772, P-Value = 0.040

Tell everyone you know who uses Minitab: The 95%/95% statement on the display ONLY applies to the Normal-distribution interval, not the Nonparametric interval.

Must Look in Session Window:



Try this using Summarized Data


Try using other Sample Sizes

95/95 Nonparametric One-sided requires n=59
95/95 Nonparametric Two-sided requires n=93
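Both minimum sample sizes can be checked with the standard order-statistic results, assuming the one-sided bound is the sample minimum (or maximum) and the two-sided interval runs from the sample minimum to the sample maximum. The function names are hypothetical.

```python
def nonparametric_one_sided_n(coverage=0.95, confidence=0.95):
    """Smallest n for which the sample minimum is a lower tolerance bound."""
    n = 1
    while 1 - coverage**n < confidence:
        n += 1
    return n


def nonparametric_two_sided_n(coverage=0.95, confidence=0.95):
    """Smallest n for which (min, max) is a two-sided tolerance interval."""
    n = 2
    while 1 - n * coverage ** (n - 1) + (n - 1) * coverage**n < confidence:
        n += 1
    return n


print("95/95 one-sided:", nonparametric_one_sided_n())
print("95/95 two-sided:", nonparametric_two_sided_n())
```

With fewer observations than these, a nonparametric interval cannot reach the stated confidence, which is why the 95%/95% label does not apply to the nonparametric bound for n = 30.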


Exercise • Compute 95/95 lower tolerance bounds for – TubeTensile2, TubeTensile3

• Compute 95/90 lower tolerance bounds for – TubeTensile2, TubeTensile3

• Compute 95/95 two-sided tolerance intervals for – TubeTensile2, TubeTensile3


Using “Summarized Data” option to Evaluate Sample Size for Tolerance Intervals
• Imagine having the following historical data on the pull strength of an electrode to plan a future study using Tolerance Intervals
– A normal distribution assumption is appropriate
– The historical mean is 4.92 lbs
– The historical standard deviation is 0.87 lbs
– The lower specification limit for pull strength is 2.0 lbs



Sample Size Evaluation for Normal-Distribution Tolerance Intervals • Ask: – How likely are these results to predict the results of the future study? – Will the future study run under the same conditions? Worst case? – Would that affect the mean or standard deviation we expect?

Sample Size for Normal Distribution Tolerance Intervals • For example, might decide to use a larger standard deviation value, say 1.10 (approximately 25% larger) as the planning value • Need to know confidence and reliability to demonstrate. For example, let’s use 95% confidence and 95% reliability. • Start with n=30 and see if that sample size would be large enough . . .



Sample Size for Normal Distribution Tolerance Intervals

Since the one-sided tolerance interval is above the specification value of 2.0, n=30 is large enough

Now try smaller sample sizes . . .

n=14 is the smallest sample size that produces an interval above 2.0
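The sample-size search above can be sketched without Minitab. For a one-sided normal tolerance bound, mean - k * s, the factor k is the value for which the bound falls below the true lower percentile with the stated confidence. The sketch below finds k by numerically integrating over the distribution of s and bisecting; this numerical approach and all function names are assumptions for illustration, not Minitab's algorithm.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bound_confidence(k, n, p):
    """P(xbar - k*s <= true lower (1 - p) quantile) for a normal sample of size n >= 3."""
    nu = n - 1
    z_p = STD_NORMAL.inv_cdf(p)
    root_n = math.sqrt(n)
    log_norm = -(nu / 2) * math.log(2) - math.lgamma(nu / 2)

    def integrand(u):
        # u = s / sigma, where nu * u**2 follows a chi-square(nu) distribution
        if u <= 0:
            return 0.0
        w = nu * u * u
        log_pdf_w = (nu / 2 - 1) * math.log(w) - w / 2 + log_norm
        pdf_u = math.exp(log_pdf_w) * 2 * nu * u
        return STD_NORMAL.cdf(root_n * (k * u - z_p)) * pdf_u

    # Simpson's rule on [0, 5]; s / sigma beyond 5 has negligible probability
    steps, upper = 1000, 5.0
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


def k_factor(n, p=0.95, conf=0.95):
    """One-sided normal tolerance factor, found by bisection."""
    lo, hi = 0.0, 20.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if bound_confidence(mid, n, p) < conf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def smallest_n(mean, sd, lsl, p=0.95, conf=0.95, n_max=200):
    """Smallest n whose planned lower tolerance bound stays above the LSL."""
    for n in range(3, n_max + 1):
        if mean - k_factor(n, p, conf) * sd > lsl:
            return n
    return None


print(f"k for n=30, 95/95: {k_factor(30):.3f}")
print(f"planned bound at n=30: {4.92 - k_factor(30) * 1.10:.3f}")
print("smallest n:", smallest_n(mean=4.92, sd=1.10, lsl=2.0))
```

The search assumes the future sample reproduces the planning mean and standard deviation exactly, so some margin above the smallest n is prudent.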


Exercises • Choose a sample size for – Normal distribution tolerance interval – One-sided specification: Min 3 lbf – Planning data: TubeTensile3

• Choose a sample size for – Normal distribution tolerance interval – Two-sided specification: 3.5 to 4.0 – Planning data: Spacing4


Tolerance Interval for Non-Normal Distributions • If a normal-distribution model is not appropriate for the data, then either – Transform the data to Normal • Use the (normal distribution) Tolerance Interval module

– Or identify a non-normal distribution (e.g. Weibull) • Use Stat -> Reliability/Survival -> Parametric Distribution Analysis • Use confidence intervals on percentiles to determine the Tolerance Interval Limits



Tolerance Intervals via Reliability/Survival Menu

One-sided
• Lower 95% / 95%:
– Calculate one-sided lower 95% confidence bound on the 5th percentile
• Upper 95% / 95%:
– Calculate one-sided upper 95% confidence bound on the 95th percentile

Two-sided
• Two-sided 95% / 95%:
– Calculate two-sided confidence intervals for 2.5th and 97.5th percentiles.
– Lower bound is the lower 95% bound on the 2.5th percentile
– Upper bound is the upper 95% bound on the 97.5th percentile


Weibull Tolerance Interval • Use Stat -> Quality Tools -> Individual Distribution Identification • Data was randomly generated from Weibull with shape 2 and scale 25. • All except Normal fit well • Imagine that due to subject-matter knowledge, Weibull is believed to be the best model



Tolerance Intervals via Reliability/Survival


Weibull 95/95 Lower Bound



Weibull 95/95 Two-sided Tolerance Interval

Interval is 2.03 to 61.51

Sample size for Weibull Tolerance Interval • See Medtronic Corporate Statistical Resources – Work Aid #2



Summary and Review • Tolerance Intervals – Calculations – Sample Size



General Linear Models (GLM) I feel like I’m regressing

LeRoy Mattson Jeremy Strief

Objectives • Understand how GLM is a generalization of ANOVA and regression • Understand three primary concepts within GLM models – Fixed vs. Random effects – Nesting vs. Crossing – Covariate (Continuous) vs. Factor (Attribute)

• Fit GLM in Minitab



Recap from Quality Trainer • One-Way ANOVA • Two-Way ANOVA • Correlation & Regression


Statistical Tools for Analyzing key Xs

                                   X: Variables (continuous)    X: Attribute (discrete)
Y: Variables (continuous data)     Regression                   t-test (1 X, 2 levels)
                                   Multiple Regression          One-way ANOVA
                                   GLM                          GLM
Y: Attribute (discrete data)       Logistic Regression          Chi Square
                                                                Logistic Regression

General Linear Models

GLM: Concepts GLM: Variable Y – One Attribute X GLM: Variable Y – Two Attribute Xs GLM: Variable Y – Mixture of Attribute & Variable Xs

GLM Introduction • GLM stands for General Linear Model • A flexible, unified approach to regression and ANOVA. • Needed when building a Y=f(X) transfer function in which the input variables don’t match a standard regression or ANOVA approach: – Regression assumes continuous X’s – ANOVA treats X’s as attributes, and it often requires a balanced experimental design in Minitab – What if your dataset does not fit into the ANOVA or Regression mold?


Motivating Example Pin Pulls.mtw

• MECC began collecting data around pull strength for a particular component. • Due to the nature of the investigation and due to resource constraints, it was not possible to execute a formal DOE. • Data were collected over a series of months, and sample sizes were not equally distributed across all the engineering conditions of interest. (So the dataset is unbalanced, in DOE language.)


Motivating Example • Response variable (Y): Pull Strength • Predictor Variables (Xs): – Hole diameter: 17.5, 18.5, or 19.5 – Fillet Style: one-sided or two-sided – Solder size: small or large

• Fillet style and Solder size are attribute metrics • Hole diameter is a variables/continuous metric



Data are unbalanced
Tabulated statistics: hole diameter, 1 or 2 sided fillet, solder size

Results for solder size = 1
Rows: hole diameter   Columns: 1 or 2 sided fillet

          1    2   All
17.5      4    3     7
18.5      0    0     0
19.5      4    2     6
All       8    5    13

Results for solder size = 2
Rows: hole diameter   Columns: 1 or 2 sided fillet

          1    2   All
17.5      9    4    13
18.5      4    0     4
19.5     16   12    28
All      29   16    45


How to Analyze in Minitab? • With multiple X’s of various types, GLM is the only method which can be used to analyze the data in Minitab • JMP also offers flexible modeling platforms through “Custom Design” and “Fit Model”



Three Main Concepts in GLM • Predictor variables (Xs) can be characterized in three ways: – Fixed vs. Random effects – Nesting vs. Crossing – Covariate (Continuous) vs. Factor (Attribute)


An Unfortunate Naming Convention • In statistical literature, there are two types of models whose names are confusingly similar. • The General Linear Model is the main topic of today’s talk. – Y is continuous – X can be continuous or categorical

• The Generalized Linear Model is a further abstraction of the General Linear Model. – Y can be continuous or categorical – X can be continuous or categorical – Subcategories of Generalized Linear Models are • Logistic regression for a binary Y • Poisson regression for a count-based Y • General linear model for a continuous Y

• The Advanced SME class will focus on the General Linear Model in Ch 5 and on Logistic Regression in Ch 6.



Topics to be covered GLM: Variable Y – One Attribute X – One-way ANOVA (review) – GLM approach – Random effect vs. Fixed effect model


One Attribute X Example
• Project Goal: Reduce late deliveries (>36 hrs.) from suppliers
MINITAB® SupplierLT.mtw

One attribute X: Example

Is there a practical difference among several suppliers?

% of Lead Time variance explained by variation in Supplier means


GLM approach to one attribute X • Model: $y_{ij} = \mu + a_i + e_{ij}$, where $i$ represents the factor level for A

Minitab Output

Expected lead time for “Blitz”: Y = 35.092 - 7.323(1) = 27.769
Expected time for “Hare”: Y = ?
Expected time for “Wild”: Y = 35.092 - 7.323(-1) + 5.023(-1) - 3.134(-1) + 7.716(-1) + 1.686(-1) = 31.125
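The arithmetic on this slide can be sketched in Python. This is an illustration, not Minitab output: it assumes the (1, 0, -1) effect coding implied by the “Wild” calculation, in which the last level's effect is the negative sum of the listed coefficients. Only “Blitz” (first level) and “Wild” (last level) are identified on the slide, so the levels are indexed by position and all variable names are hypothetical.

```python
# Coefficients copied from the slide; names are illustrative only.
constant = 35.092
listed_effects = [-7.323, 5.023, -3.134, 7.716, 1.686]  # levels 1..5

# Under sum-to-zero (effect) coding, the last level's effect is
# minus the sum of the effects that Minitab prints.
last_effect = -sum(listed_effects)

# Expected lead time for a level = constant + that level's effect.
expected = [constant + e for e in listed_effects] + [constant + last_effect]

for level, value in enumerate(expected, start=1):
    print(f"level {level}: expected lead time = {value:.3f}")
```

Level 1 (“Blitz”) reproduces 27.769, and the last level (“Wild”) comes out at about 31.124, which agrees with the slide's 31.125 up to rounding of the displayed coefficients.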


GLM: multiple comparisons


Multiple comparison


Capabilities of ANOVA vs GLM

Capability                                                        ANOVA   GLM
Can fit unbalanced data                                           no*     yes
Can specify factors as random and obtain expected mean squares    yes     yes
Fits covariates                                                   no      yes
Performs multiple comparisons                                     no*     yes

* Except for one-way ANOVA

GLM has some limits
Just like the one-way ANOVA:
• residuals should be normally distributed
• residuals should not have a pattern when plotted against the predicted Y
• residuals should not have a pattern when plotted in run order
Just like regression:
• one should check that factors aren’t highly correlated
• one should simplify the model.

What are Random Effects? • Random X – X is a random factor when the levels of X are randomly chosen from a population of possible levels. – Inferences are made on the overall population of Xs, rather than on the specific levels chosen for the experiment. – Random effect models focus on estimating variance components. How much variation in Y is due to X? There is less concern with estimating the mean for any particular level of X. Example: Selecting a random sample of 3 operators and a random sample of 5 parts for a Gage R&R Study in MSA

What are Fixed Effects? • Fixed X – The specific levels used in the experiment will be controlled and replicated in a real manufacturing situation. – There are only a few discrete levels of X which are of scientific interest, or there are only a few discrete levels of X which can actually be produced in the real world. – We are specifically interested in estimating the mean value of Y for a given value of X.

Fixed vs. Random Quiz
1. MECC wishes to understand the impact of two different material suppliers upon weld penetration. Based on the specific performance of each supplier, MECC intends to establish a long-term contract with one or both suppliers.
– Supplier is a _____ effect for the response of weld penetration.
2. In a Gage R&R study, we select three operators from a pool of 30. We are not interested in the specific performance of the 3 operators in the experiment; we wish to understand the variability due to operator.
– Operator is a _____ effect.


Common Examples in Manufacturing • Fixed Effects: – Designs – Suppliers – Material types – Controllable process settings (e.g. laser power, position, etc.)

• Random Effects: – Lots – Operators – Subsampling from a finite population of levels – Noise variables (uncontrollable aspects of a process)

Random Effect vs. Fixed Effect
• Example: Fiber Strength Data (MINITAB® Loom.mtw)
• Model: $y_{ij} = \mu + a_i + e_{ij}$
• Random Effect Model: $\mathrm{Var}(y) = \sigma_a^2 + \sigma^2$
• Objective of Random Effect Model: Estimate $\sigma_a^2$ and $\sigma^2$
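As a rough illustration of what estimating the two variance components involves, the sketch below applies the usual method-of-moments (expected mean squares) estimates for a balanced one-way random effects model: the error variance is the within-level mean square, and the between-level component is (MS between - MS within) / n. The function name and the toy numbers are made up for illustration; they are not the Loom.mtw data, and Minitab's own output should be treated as the reference.

```python
def variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random effects model."""
    a = len(groups)        # number of levels of the random factor
    n = len(groups[0])     # observations per level (balanced design assumed)
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    level_means = [sum(g) / n for g in groups]

    ss_between = n * sum((m - grand_mean) ** 2 for m in level_means)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, level_means) for y in g)

    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (a * (n - 1))

    sigma2_error = ms_within
    # E(MS_between) = sigma^2 + n * sigma_a^2, so solve for sigma_a^2.
    # A negative estimate is truncated to zero.
    sigma2_a = max((ms_between - ms_within) / n, 0.0)
    return sigma2_a, sigma2_error


# Made-up numbers for illustration only.
toy = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
sigma2_a, sigma2_error = variance_components(toy)
print(f"sigma_a^2 = {sigma2_a:.3f}, sigma^2 = {sigma2_error:.3f}")
```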




One-way ANOVA for fiber strength data

na2


GLM for Random Effect Model
MINITAB® Loom.mtw

Compare to the manual results




Topics to be covered GLM: Variable Y – Two Attribute Xs – Two-way ANOVA – GLM approach – Crossed vs. Nested design


Example: Two Attribute Xs
Problem Statement: Customer service call center staffing is often too high (waste) or too low (low customer satisfaction).
Project Goal: Improve call center forecast accuracy. An accurate forecast is within 20 calls of actual.
Path Y: Calls Received
Xs:
• Day (Monday to Friday)
• Shift: 1 (21:00-3:00), 2 (3:00-9:00), 3 (9:00-15:00), 4 (15:00-21:00)
MINITAB® Call Center Attribute XR.mtw
Monday begins at 21:00 on Sunday, etc.

ANOVA approach
Dialog callouts: Y, Xs; interaction between Xs is included by default

ANOVA Output
• $y_{ijk} = \mu + a_i + b_j + ab_{ij} + e_{ijk}$

General Linear Model Approach

Dialog callouts: Y, Xs

General Linear Model, cont.
• p < 0.05 (Day and Shift are Key Xs)
• p < 0.05 (Day*Shift interaction is significant)

GLM: Main Effect Plot


GLM: Main Effect Plot, cont
[Figure: Main Effects Plot (fitted means) for Calls Received; panels for Day (Mon to Fri) and Shift (1 to 4); vertical axis: Mean of Calls Received]
• Each point in the Day panel is the mean number of calls received for that day
• Each point in the Shift panel is the mean number of calls received for that shift
• # of calls received decreases by day of week
• # of calls received is lower for 1st shift
• Since interaction is significant, these plots do not tell the whole story!

GLM: Interaction Plot


GLM: Interaction Plot, continued

Interaction = Lines NOT Parallel

Each line is a different day

Each line is a different shift

• Shift 1 appears to have more calls on Monday than other days • Since p < 0.05 for Day*Shift, this observed interaction is significant • Effect of Shift depends on Day. Effect of Day depends on Shift.

Example: Statistical Impact of X on Y:
• Shift: 29278.4/39776.0 = 73.6% of the variation in calls received
• Day: 1473.8/39776.0 = 3.7% of the variation in calls received
• Day*Shift: 5412.4/39776.0 = 13.6% of the variation in calls received
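The percentages on this slide can be reproduced with a few lines of Python. The sketch assumes that each numerator is the sum of squares for that term from the GLM output and that 39776.0 is the total sum of squares; the dictionary and variable names are illustrative.

```python
# Sums of squares as quoted on the slide (assumed to come from the GLM ANOVA table).
ss_total = 39776.0
ss_terms = {"Shift": 29278.4, "Day": 1473.8, "Day*Shift": 5412.4}

# Share of the total variation attributed to each term, in percent.
pct = {term: 100.0 * ss / ss_total for term, ss in ss_terms.items()}

for term, value in pct.items():
    print(f"{term}: {value:.1f}% of the variation in calls received")
```

Ranking the terms this way shows practical impact alongside statistical significance: all three terms have p < 0.05, but Shift accounts for most of the variation.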


Exercise: All Xs Attributes Y Variables
MINITAB® Days Overdue.mtw
Project Goal: Improve On Time Delivery to Customer
Project Strategy: Path Y = Days Overdue
Xs: X1 = Product (1 or 2); X2 = Priority (1 to 4), 1 = Highest Priority, 4 = No Priority
Task: Perform ALL steps of Analyze using the data
Approach: Work alone or in small groups.
Time: 15 Minutes

Exercise Debrief
Solution:
• What are the key Xs?
• What is the relationship between the key Xs and Y?
• What is the impact of the key Xs on Y?
What was difficult?

Residuals Verify Assumptions
MINITAB® Days Overdue.mtw

Residuals Verify Assumptions
[Figure: Residual Plots for Missed Days: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
• Verify Normality Assumption (want fit to line): Normal Probability Plot of the Residuals
• Verify Equal Variance Assumption (want no patterns): Residuals Versus the Fitted Values
• Verify Independence Assumption (want no patterns): Residuals Versus the Order of the Data

Another two attribute Xs example: Gage R&R
MINITAB® Micrometer.mtw
Design: Crossed design
Model: Random Effect model

Nesting • Factor B is nested in factor A if the levels of B have different meanings for each level of A. • Stated differently, factor B is nested in factor A if there is a completely different set of levels of B for every level of A. • Minitab notation: “B(A)” means B is nested within A.



Nesting Example • Example: An experiment is run with three suppliers, each of which produces three batches of material. – There clearly are three levels of supplier, but how many levels of batch are there? – Batch 1 from supplier 1 has nothing to do with batch 1 from supplier 2. Batch “level 1” has no consistent meaning across suppliers. So Batch is nested in supplier. – Instead of labeling the batch levels as 1-3, it would be appropriate to label them 1-9.

• You know that B is nested in A if it is reasonable to label each level of B differently, depending on the level of A.
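The relabeling idea in the nesting example can be sketched in Python: every (supplier, batch) pair gets its own label, so the three within-supplier batch numbers become nine distinct batch levels. The variable names and the particular numbering scheme are illustrative assumptions; any scheme that gives each pair a unique label would do.

```python
suppliers = [1, 2, 3]
batches_within_supplier = [1, 2, 3]

# Map each (supplier, batch) pair to a unique batch label 1..9.
unique_batch = {}
for s in suppliers:
    for b in batches_within_supplier:
        unique_batch[(s, b)] = (s - 1) * len(batches_within_supplier) + b

for (s, b), label in unique_batch.items():
    print(f"supplier {s}, batch {b} -> batch level {label}")
```

Because “batch 1” of supplier 1 and “batch 1” of supplier 2 map to different labels (1 and 4), the relabeled data make it clear that there is no shared batch level across suppliers, which is what nesting means.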

Crossing • Factor B is crossed with Factor A if the levels of B have the same meaning for each level of A. • This is the standard factorial structure of a DOE • Example: An experiment is run with three suppliers, each of which utilizes two types of material—100% gold or 100% nickel. – Gold and Nickel have the same meaning and same interpretation, regardless of supplier. – Supplier is therefore crossed with material.



Example of Nested Design
• Levels of Batches nested within levels of supplier – Is this a factorial design? – Can we estimate Supplier x Batch interaction?

$y_{ijk} = \mu + a_i + b_{j(i)} + e_{k(ij)}$

Nested Design - continued • A company buys raw material in batches from 3 different suppliers. The purity of this material varies considerably, which causes problems in manufacturing the finished product. We wish to determine if the variability in purity is attributable to differences between the suppliers. Four batches of raw material are selected at random from each supplier, and 3 determinations are made on each batch.
MINITAB® Purity.mtw

ANOVA for the Purity data
$y_{ijk} = \mu + a_i + b_{j(i)} + e_{k(ij)}$
• A = Fixed or Random? B = Fixed or Random?
• Is Supplier a key X?
• $\hat{\sigma} = 1.62$, $\hat{\sigma}_{batch} = ?$
• Are there differences among suppliers?

Incorrect GLM Analysis
Supplier and batch fixed effects
Two-way ANOVA: purity versus supplier, batch

Source        DF   SS        MS        F      P
supplier       2    15.056   7.52778   2.85   0.077
batch          3    25.639   8.54630   3.24   0.040
Interaction    6    44.278   7.37963   2.80   0.033
Error         24    63.333   2.63889
Total         35   148.306

S = 1.624   R-Sq = 57.30%   R-Sq(adj) = 37.72%
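To see why the crossed analysis is misleading, the sketch below re-derives the nested ANOVA from the sums of squares in this table. It assumes batch is nested within supplier and is a random factor (the slides pose fixed vs. random as a question, so this is an assumption). In a nested design the batch and interaction lines pool into a single batch(supplier) term, and supplier is tested against that term rather than against error. Variable names are illustrative, and p-values are left to Minitab.

```python
# Sums of squares and degrees of freedom from the (incorrect) crossed table.
ss_supplier, df_supplier = 15.056, 2
ss_batch, df_batch = 25.639, 3
ss_inter, df_inter = 44.278, 6
ss_error, df_error = 63.333, 24

# Nested design: batch and supplier*batch pool into batch(supplier).
ss_nested = ss_batch + ss_inter
df_nested = df_batch + df_inter

ms_supplier = ss_supplier / df_supplier
ms_nested = ss_nested / df_nested
ms_error = ss_error / df_error

# With batch random, supplier is tested against batch(supplier);
# batch(supplier) is tested against error.
f_supplier = ms_supplier / ms_nested
f_nested = ms_nested / ms_error

print(f"batch(supplier): SS = {ss_nested:.3f}, DF = {df_nested}, MS = {ms_nested:.3f}")
print(f"F for supplier        = {f_supplier:.2f}")
print(f"F for batch(supplier) = {f_nested:.2f}")
```

Under these assumptions the F ratio for supplier drops from 2.85 to about 0.97, while batch(supplier) has an F ratio of about 2.94, which suggests the batch-to-batch variation within suppliers deserves more attention than the supplier differences.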


GLM Exercise: MINITAB®

(Purity.mtw)

Is supplier a key X?



Assume that suppliers were randomly chosen (i.e., random effect), and estimate supplier using GLM.

Summary: Different Types of Xs
• 1 X at a time: F/R
• 2 or more Xs at a time: F/R, C/N
F = Fixed, R = Random, C = Crossed, N = Nested

Specifying the Model Terms in Minitab

Example: Factors A, B crossed
  Statistical model: $y_{ijk} = \mu + a_i + b_j + ab_{ij} + e_{ijk}$
  Terms in model: A, B, A*B

Example: Crossed and nested (B nested within A, both crossed with C)
  Statistical model: $y_{ijkl} = \mu + a_i + b_{j(i)} + c_k + ac_{ik} + bc_{jk(i)} + e_{l(ijk)}$
  Terms in model: A, B(A), C, A*C, B*C

Exercise
MINITAB® Time.MTW
Problem Statement: Assembly time is too long for a manufacturing process. Type of layout and type of fixture are suspect Xs for assembly lead time. Two (2) different layouts and three (3) different fixtures are to be tested. Two (2) groups of 4 operators each are randomly selected to test each layout with the 3 fixtures, two times. All factorial combinations of layout and fixture are completely randomized in the experiment.
Task: Are type of fixture and layout key Xs for assembly time?
Experimental Design:
• Layout L1: Operators O1, O2, O3, O4
• Layout L2: Operators O5, O6, O7, O8
• Fixtures: F1, F2, F3
Time: 20 minutes



Topics to be covered GLM: Variable Y – Mixture of Attribute and Variable Xs – GLM with Covariates – Strategic GLM


When Can I Treat an X as Variables?
• When the relationship between X and Y can be described with a line or curve
• Number of levels does not determine Variables X vs Attribute X
[Figure: Variables (Covariate) X. Coffee Taste vs Brew To Serve Time, actual data with a quadratic curve Y=F(X). X = Brew To Serve Time has 3 levels (1, 30, 60).]
[Figure: Attribute (Factor) X. Lead Time (Days) vs Supplier, actual data with no line or curve Y=F(X). X = Supplier has 10 levels (1 to 10).]

GLM for Mixture of Attribute and Variable Xs
• Specify variable Xs as covariates
Example: MINITAB® Catapult Multiple X.mtw
Are there significant main effects? interactions? curvature?

Analyze Centering Xs - Main Effects

The Covariates field tells MINITAB which Xs are variables

Analyze Xs - Main Effects

Source     DF   Seq SS    Adj SS    Adj MS    F       P
Rub Band    3    3784.8    5991.3    1997.1   14.91   0.000
Shot        1       8.3      61.3      61.3    0.46   0.503
Operator    1      55.7      40.2      40.2    0.30   0.587
Ball        2     684.3     572.2     286.1    2.14   0.133
Time PB     1     336.8       1.4       1.4    0.01   0.919
PB Angle    1   10108.2   10108.2   10108.2   75.45   0.000
Error      37    4957.1    4957.1     134.0
Total      46   19935.2

• Rub Band, PB Angle are significant Main Effects
• Ball is close (include it for now)
• Remember: p-values will change when terms are added or deleted from model

Reduce Terms Edit Last Dialog

Tells MINITAB to give coefficients for Attribute as well as Variables Xs


Reduce Terms

Source     DF   Seq SS    Adj SS    Adj MS    F       P
Rub Band    3    3784.8    5988.0    1996.0   15.63   0.000
Ball        2     691.8     681.2     340.6    2.67   0.082
PB Angle    1   10348.8   10348.8   10348.8   81.01   0.000
Error      40    5109.8    5109.8     127.7
Total      46   19935.2

• Ball p-value smaller with Shot, Operator, Time PB removed from model
• Should we keep Ball?

What If We Treat PB Angle as Attribute?

Source     DF   Seq SS    Adj SS    Adj MS   F       P
Rub Band    3    3784.8    5212.1   1737.4   19.13   0.000
Ball        2     691.8     900.0    450.0    4.95   0.012
PB Angle    4   12097.9   12097.9   3024.5   33.30   0.000
Error      37    3360.7    3360.7     90.8
Total      46   19935.2

PB Angle treated as:   Variable   Attribute
DF                     1          4
Adj SS                 10348.8    12097.9
F                      81.01      33.30
p                      0.000      0.000
Error DF               40         37

What If We Treat PB Angle as Attribute?

Term        Coef       SE Coef   T        P
Constant     99.896    1.531     65.24    0.000
Rub Band
  1         -14.661    2.725     -5.38    0.000
  2          19.518    2.942      6.63    0.000
  3           3.354    2.597      1.29    0.204
Ball
  Golf        7.570    2.664      2.84    0.007
  Wiffle     -5.102    2.031     -2.51    0.016
PB Angle
  130       -32.106    3.035    -10.58    0.000
  140        -4.533    2.920     -1.55    0.129
  150         7.497    3.129      2.40    0.022
  160         9.455    3.669      2.58    0.014

Model Prediction (Rub Band = 1, Ball = Wiffle, PB Angle = 150)
Distance = 99.896 - 14.661 - 5.102 + 7.497 = 87.63

Model Prediction (Rub Band = 4, Ball = Golf, PB Angle = 180)
Impossible: Can only get PB Angle predictions for 130, 140, 150, 160, 170
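The limitation shown here, that an attribute X only supports predictions at levels that were actually run, can be sketched as a lookup in Python. The coefficients are copied from the table; the helper names are hypothetical. The sketch assumes Minitab's sum-to-zero coding, so the unprinted last level of each factor (Rub Band 4, PB Angle 170) has an effect equal to minus the sum of the printed ones. The third Ball level is not named on the slide, so it is left out.

```python
constant = 99.896
listed = {
    "Rub Band": {1: -14.661, 2: 19.518, 3: 3.354},
    "Ball": {"Golf": 7.570, "Wiffle": -5.102},
    "PB Angle": {130: -32.106, 140: -4.533, 150: 7.497, 160: 9.455},
}


def with_last_level(coefs, last_level):
    """Add the unprinted last level, assuming the effects sum to zero."""
    full = dict(coefs)
    full[last_level] = -sum(coefs.values())
    return full


rub_band = with_last_level(listed["Rub Band"], 4)
pb_angle = with_last_level(listed["PB Angle"], 170)
ball = listed["Ball"]  # third ball level is not named on the slide


def predict_distance(band, ball_type, angle):
    """Predicted distance when PB Angle is treated as an attribute X."""
    checks = (("Rub Band", rub_band, band), ("Ball", ball, ball_type), ("PB Angle", pb_angle, angle))
    for name, table, level in checks:
        if level not in table:
            raise ValueError(f"{name} = {level} was not a level in the experiment")
    return constant + rub_band[band] + ball[ball_type] + pb_angle[angle]


print(round(predict_distance(1, "Wiffle", 150), 2))  # the slide's worked example
try:
    predict_distance(4, "Golf", 180)
except ValueError as err:
    print("Impossible:", err)
```

Treating PB Angle as a covariate instead would give a slope (and possibly a curvature term) that allows interpolation between tested angles, which is the trade-off this slide is pointing at.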


Interactions

If sample size is small try interactions one at a time


Interactions

Source          DF   Seq SS    Adj SS   Adj MS   F       P
Rub Band         3    3784.8   4423.8   1474.6   10.19   0.000
Ball             2     691.8    502.4    251.2    1.74   0.192
PB Angle         1   10348.8   8766.5   8766.5   60.55   0.000
Rub Band*Ball    6     187.5    187.5     31.2    0.22   0.969
Error           34    4922.3   4922.3    144.8
Total           46   19935.2

• Rub Band * Ball Interaction not significant
• Note: 6 DF (degrees of freedom) for Rub Band*Ball = 3 * 2
• 34 DF left for Error = 46 - 3 - 2 - 1 - 6
• If DF for Error decreases then p values increase
• If DF for Error reaches 0 then no p values are possible (MINITAB will complain!)
• Conclusion: Be careful when adding interactions (DF for Error may reach 0)

Interactions
p-values for Interactions

            Rub Band   Ball
Ball        0.969      -
PB Angle    0.566      0.211

Conclusion: No significant interactions
Should we test interactions with Shot? Operator? Time PB?

Review: Linear vs Curvature
[Figure: Main Effects Plot (data means) for Taste vs Brew to Serve Time (1.0, 30.5, 60.0); point types: Corner, Center; a Linear Model line compared with a Curvature Model curve; vertical axis: Mean of Taste]
• Curvature only applies to variables Xs!
• Quadratic Model: $Y = aX^2 + bX + c$

Curvature - Detecting with Residuals

Plot Residuals vs PB Angle to graphically check for curvature

Curvature - Detecting with Residuals
[Figure: Residuals Versus PB Angle (response is Distance); standardized residuals plotted against PB Angle from 130 to 170]
Looks like curvature. Now let’s prove it!

Curvature - Add X² Term to Model

(PB Angle)²

Curvature - Add X² Term to Model

Source              DF   Seq SS    Adj SS   Adj MS   F       P
Rub Band             3    3784.8   6036.3   2012.1   20.93   0.000
Ball                 2     691.8   1060.9    530.5    5.52   0.008
PB Angle             1   10348.8   1685.1   1685.1   17.53   0.000
PB Angle*PB Angle    1    1360.4   1360.4   1360.4   14.15   0.001
Error               39    3749.3   3749.3     96.1
Total               46   19935.2

• (PB Angle)² is significant
• Note: Ball main effect is now significant

Final Model - Check Residuals
We now have all terms for our model. We need to check the residuals to understand how good the model is.

Final Model - Check Residuals
Brush over to find which point is causing trouble!
[Figure: Residual Plots for Distance: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
What do we conclude?

Exercise
• Twelve steel brackets were randomly divided into three groups and sent to three vendors to be zinc plated. The chief concern in this process is whether or not there is any difference in zinc thickness among vendors. The following table lists the plating thickness (Y), as well as the thickness of the bracket (X), in hundred-thousandths of an inch.
MINITAB® Zinc plating.mtw

Exercise: Questions
1) One-way ANOVA: X = vendor. Are there significant differences among vendors?
2) GLM: X1 = vendor, X2 = Bracket Thickness. How does this change the conclusion?
3) Bonus Question: If you were to do this testing again, what would you do differently? Use a graphical tool to support your rationale (Suggestion: try Interaction Plot under ANOVA)

Problems with Designs: Correlated Xs


Return to MECC Example Pin Pulls.mtw • Response Variable (Y): Pull Strength • Predictor Variables (Xs): – Hole diameter: 17.5, 18.5, or 19.5 – Fillet Style: one-sided or two-sided – Solder size: small or large

• Exercise: – Fit a GLM to create a model for pull strength – Can Hole diameter be reasonably treated as a covariate? (Engineering theory suggests that it can.) – Determine if variables are fixed vs. random, crossed vs. nested – Which X’s are statistically significant?

Summary And Recap • Understand how GLM is a generalization of ANOVA and regression • Understand three primary concepts within GLM models – Fixed vs. Random effects – Nesting vs. Crossing – Covariate (Continuous) vs. Factor (Attribute)

• Fit GLM in Minitab



Logistic Regression I still feel like I’m regressing LeRoy Mattson

Objectives • Understand how logistic regression creates a predictive model for an attribute Y • Fit logistic regression models in Minitab



Logistic Regression

Logistic Regression – Attribute Y, One X Logistic Regression – Attribute Y, Multiple Xs

Attribute Y Data Types
• Individual unit categorized into a classification
• Finite number of possible values
• Cannot be subdivided meaningfully

4 Attribute data types:
• Binary (pass/fail, good/bad)
• Nominal (complaint codes, problem type)
• Ordinal (low/medium/high, mild/moderate/severe)
• Discrete (# errors)

Is Smoking (X) a key X for Lung cancer (Y)?
Y: Lung Cancer; Yes, No
X: Smoking; Yes, No

X \ Y         Lung Cancer   No Lung Cancer   Total
Smoker        2             3                5
Non-smoker    1             8                9

Analysis tools: Relative Risk or Odds Ratio
The Relative Risk of lung cancer for smoker vs non-smoker = (2/5)/(1/9) = 3.6

Concept: Odds Ratio (OR) as a Measure of X Impact for Attribute Y
OR = odds of Y outcome in one group relative to another group = (odds of cancer for smokers) / (odds of cancer for non-smokers)

$\mathrm{OR} = \frac{2/3}{1/8} = \frac{0.67}{0.125} = 5.33$

Interpretation of Odds Ratio: 5.33 • Odds of cancer for smokers are 5.33 times the odds for non-smokers • Odds of getting cancer are increased 433% with smoking
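Relative risk and odds ratio are easy to confuse, so the short Python sketch below computes both from the smoking table (2 of 5 smokers and 1 of 9 non-smokers with lung cancer). The dictionary layout and function names are illustrative assumptions.

```python
# Counts from the 2x2 table on the earlier slide.
table = {
    "smoker": {"cancer": 2, "no_cancer": 3},
    "non_smoker": {"cancer": 1, "no_cancer": 8},
}


def risk(row):
    """Proportion of the group with the outcome."""
    return row["cancer"] / (row["cancer"] + row["no_cancer"])


def odds(row):
    """Outcome count divided by non-outcome count."""
    return row["cancer"] / row["no_cancer"]


relative_risk = risk(table["smoker"]) / risk(table["non_smoker"])
odds_ratio = odds(table["smoker"]) / odds(table["non_smoker"])

print(f"Relative risk = {relative_risk:.2f}")
print(f"Odds ratio    = {odds_ratio:.2f}")
```

The two measures differ here (3.6 versus 5.33) because the outcome is not rare in this small table; the 433% figure describes the increase in odds, not in probability.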


Statistical Tools for Analyzing key Xs

                            X: Variables (continuous)              X: Attribute (discrete)
Y: Variables (continuous)   Regression, Multiple Regression, GLM   t-test (1 X, 2 levels), One-way ANOVA, GLM
Y: Attribute (discrete)     Logistic Regression                    Chi Square, Logistic Regression

Example – Binary Logistic Regression Attribute X
Problem Statement: Is smoking associated with disease in previous example? Y? X?
MINITAB® Smoking.MTW
Task: In this module… 1) What tool(s) for Hypothesis Test? 2) What tool(s) for Graphical Analysis?
Approach: Work individually.
Y = Cancer or Cancer-Free
X = Exposure (smoking/nonsmoking)

Logistic Regression Analysis


Logistic Regression Analysis – cont.
Wald Test to Verify Key X: If p-value < 0.05, X is Key (Smoking is not Key X)

OR for Attribute X Impact: OR of disease for exposed relative to unexposed = 5.33 (433% increase in odds of disease for smoking relative to nonsmoking)

Exercise – Binary Logistic Regression Attribute X
Problem Statement: A new data set on smoking has been collected. Analyze this data set and determine if smoking has an effect on cancer.
MINITAB® Smoking2.MTW

Example – Binary Logistic Regression Variable X
Problem Statement: Toy company is interested in whether a toy missile will hit flying targets of varying speeds. Y? X?
MINITAB® Speed.MTW
Task: In this module… 1) What tool(s) for Hypothesis Test? 2) What tool(s) for Graphical Analysis?
Approach: Work individually.
Y = Hit or Miss? (1/0)
X = Target speed

Incorrect Analysis: Variables Y
[Figure: Fitted Line Plot, hit or miss = 1.562 - 0.003005 target speed (cms/sec); S = 0.397278, R-Sq = 41.8%, R-Sq(adj) = 39.3%; horizontal axis: target speed (cms/sec), 200 to 500]
• Fitted line extends beyond 0 and 1
[Figure: Residual Plots for hit or miss: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
• Heteroscedastic (unequal) variances

Correct Analysis: Use Binary Logistic Regression
What if we analyze proportion of hits?
[Figure: Proportion p vs X (b1 < 0), an S-shaped curve; p(x) = proportion of Y-attributes at each X value]
[Figure: Logit(p) vs X, a straight line; logit(p) = b0 + b1X]
• Logit transformation straightens S-shape to straight line
• $\mathrm{logit}(p) = \log_e\left[\frac{p}{1-p}\right]$
• Logistic f(x): $\log_e\left[\frac{p(x)}{1-p(x)}\right] = b_0 + b_1 X$
Origins: Verhulst (mathematician) named the logistic function (1838-1847: 3 papers). Pearl and Reed (1920, Johns Hopkins, Biometry and Vital Statistics) rediscovered the logistic to model population growth in the US.


Binary Logistic Regression
Does target speed affect hit or miss?
Dialog callouts:
• Raw data (0,1)
• Declare Attribute Xs
• Fitted probabilities stored as EPRO1
• Default: Event is “1”

Identifying Key Xs
Link Function: Logit

Response Information
Variable      Value   Count
hit or miss   1       13   (Event)
              0       12
              Total   25

Logistic Regression Table
Predictor                Coef         SE Coef     Z       P       Odds Ratio   95% CI Lower   95% CI Upper
Constant                 5.56028      2.04130     2.72    0.006
target speed (cms/sec)   -0.0156619   0.0055920   -2.80   0.005   0.98         0.97           1.00

Log-Likelihood = -11.411
Test that all slopes are zero: G = 11.796, DF = 1, P-Value = 0.001

• Wald Test to Verify Key X: If p-value < 0.05, X is Key (Target speed is Key X)
• Compare with GLM: Wald test in Logistic is similar to t-test in GLM

Measuring X Impact – Odds Ratio for X (OR)
Link Function: Logit
Response Information: hit or miss = 1 (Event): 13; 0: 12; Total: 25

OR for Variables X Impact: OR for a c unit increase in X = $e^{c \cdot b_1}$ = (Odds Ratio for 1 unit X increase)$^c$ (Need to determine meaningful c)

Logistic Regression Table
Predictor                Coef         SE Coef     Z       P       Odds Ratio   95% CI Lower   95% CI Upper
Constant                 5.56028      2.04130     2.72    0.006
target speed (cms/sec)   -0.0156619   0.0055920   -2.80   0.005   0.98         0.97           1.00

For a 50 unit increase in target speed, the odds of hitting the target are multiplied by $e^{50 \times (-0.0156619)} \approx 0.46$ (i.e., a 54% reduction in odds). The displayed odds ratio of 0.98 is rounded; raising 0.98 itself to the 50th power would give about 0.36.
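The odds ratio for a c unit change can be checked with a short Python sketch that works from the fitted coefficient rather than the rounded odds ratio. The function name is illustrative; the slope is the target speed coefficient from the Logistic Regression Table.

```python
import math

b1 = -0.0156619  # coefficient for target speed (cms/sec) from the output


def odds_ratio(c, slope=b1):
    """Odds ratio for a c unit increase in X: exp(c * b1)."""
    return math.exp(c * slope)


or_1 = odds_ratio(1)
or_50 = odds_ratio(50)
rounded_power = 0.98 ** 50  # what you get from the two-decimal display value

print(f"OR for  1 unit  = {or_1:.4f}")
print(f"OR for 50 units = {or_50:.3f}")
print(f"0.98 ** 50      = {rounded_power:.3f}")
```

The gap between the last two numbers shows why it is safer to compute exp(c * b1) from the coefficient: small rounding in the one-unit odds ratio is magnified when it is raised to a large power.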

Graphical Analysis – Plot logistic curve

Graphical Analysis – Plot logistic Curve
Check against raw data
[Figure: Scatterplot of hit or miss, EPRO1 vs target speed (cms/sec); the EPRO1 points trace the fitted logistic curve over the raw 0/1 data; horizontal axis: target speed (cms/sec), 200 to 500]
Target speed = 350, 50% chance of hitting target
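The fitted probabilities that Minitab stores (EPRO1 in this example) can be approximated directly from the two coefficients in the Logistic Regression Table. The Python sketch below is an illustration with hypothetical function names, not a substitute for the stored values; it also solves for the speed at which the predicted chance of a hit is exactly 50%, which is where b0 + b1X = 0.

```python
import math

b0 = 5.56028      # Constant from the Logistic Regression Table
b1 = -0.0156619   # coefficient for target speed (cms/sec)


def p_hit(speed):
    """Predicted probability of a hit at a given target speed."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * speed)))


# The 50% point is where the linear predictor equals zero.
speed_at_50_pct = -b0 / b1

for speed in (200, 250, 300, 350, 400, 450, 500):
    print(f"speed {speed}: P(hit) = {p_hit(speed):.2f}")
print(f"50% chance of a hit at about {speed_at_50_pct:.0f} cms/sec")
```

With these coefficients the 50% point falls near 355 cms/sec, consistent with the slide's reading of roughly 350 from the plot.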

Exercise: Binary Logistic Regression
Problem Statement: Chemotherapy induced remission rate takes too long to measure and is inaccurate, causing delays and errors in cancer research.
Project Strategy: Determine if labeling index is a variables path Y for remission rate. Labeling index measures the proliferative activity of cells after a patient receives an injection of thymidine as part of chemotherapy. It represents the percentage of cells that are “labeled” (Lee, 1974).
MINITAB® Cancer Remission.MTW (from Lee*)
Task: Verify if labeling index is a suitable Path Y for remission rate.
Approach: Work individually or in pairs
Time: 10 minutes

* Lee (1974): A computer program for linear logistic regression analysis. Computer Prog. Biomed. 4: 80-92.

Exercise Debrief
Solution:
1. Is labeling index a Key X?
2. What hypothesis test did you use to verify the key X?
3. Compare the results from fitted line plot.
4. Is impact of labeling index large enough to use as a Path Y for remission rate?
What did you learn?

Logistic Regression

Logistic Regression – Attribute Y, One X Logistic Regression – Attribute Y, Multiple Xs


Example: Multiple Binary Logistic Regression
A cancer study showed the number of cases of esophageal cancer, classified by age group and alcohol consumption (0=none, 1=some). Y? Xs? Data type?
MINITAB® EsophagealCancer.MTW
Task: Verify if age group and alcohol consumption are key Xs for incidence of esophageal disease.
Alcohol is Attribute X

Study Effect of alcohol consumption/age on cancer cases
[Figure: Fitted Line Plot, cancer % = 14.40 + 0.1286 age; S = 8.45436, R-Sq = 9.2%, R-Sq(adj) = 0.0%; horizontal axis: age, 20 to 80]

Are Xs correlated? (potential confounding of Xs)
[Figure: Scatterplot of alc% vs age; horizontal axis: age, 20 to 80]
Correlated Xs (corr = 51%)

Rows: Alcohol   Columns: Age Group   (count, with column % in parentheses)

Alcohol   25            35            45            55            65            75            All
0          6 (18.18)     3 (14.29)    30 (83.33)     9 (30.00)    27 (81.82)    18 (66.67)    93 (51.67)
1         27 (81.82)    18 (85.71)     6 (16.67)    21 (70.00)     6 (18.18)     9 (33.33)    87 (48.33)
All       33 (100.00)   21 (100.00)   36 (100.00)   30 (100.00)   33 (100.00)   27 (100.00)   180 (100.00)

% alcohol use depends on age-group.


Example – Hypothesis Test
Logistic Regression Table

Predictor   Coef        SE Coef     Z       P       Odds Ratio   95% CI Lower   95% CI Upper
Constant    -2.72159    0.753215    -3.61   0.000
Alcohol 1   0.733554    0.406715    1.80    0.071   2.08         0.94           4.62
Age Group   0.0187340   0.0119209   1.57    0.116   1.02         1.00           1.04

• OR for Attribute X Impact = Odds Ratio
• Alcohol is Key X
• Odds of getting cancer increase by 108% with alcohol use

Graphical Analysis – Raw data and Logistic Regression Estimates
[Figure: Scatterplot of EPRO1 vs Age Group, grouped by Alcohol (0, 1), showing the fitted logistic model; horizontal axis: Age Group, 20 to 80; vertical axis: EPRO1, 0.10 to 0.35]

Exercise: Logistic Regression
Problem Statement: A sample of ingots is treated with four levels of heat time and five levels of soak time. The response is the number of ingots ready to be rolled (out of those tested) for each combination of times.
Project Goal: Maximize the ingots ready to be rolled.
MINITAB® Ingots.MTW
Task: Verify if heat time and soak time are Key Xs.
Approach: Work individually or in pairs
Time: 10 minutes

Summary Quiz

True or False

___________

Use Binary logistic regression when Y is Variables

___________

Odds ratio is odds of Y outcome in one group relative to another group

___________

Use GLM analysis when Y is attribute at 2 levels


Statistical Resources

Avoiding wheel re-invention LeRoy Mattson

Objectives
• Ensure you are aware of statistical resources both internal and external to Medtronic:
  – Medtronic Statistical Resources Web Site
  – External Web Sites
• This chapter can serve as a reference document after the class is complete.



Medtronic Statistical Resource Web Site

http://mitintra.corp.medtronic.com/corporate-statistics/


Software Validation Plans & Reports

For links to Validation Plans & Reports: Click on the Search button on the Web Site
• For Medstat Plans/Reports: Enter “Medstat validation”
• For Minitab Plans/Reports: Enter “Minitab validation”
• For Crystal Ball Validation Plans/Reports: Enter “Crystal Ball validation”
Note: These links are to PDF documents stored in Documentum.



About Corporate Stats


About Corporate Stats Cont.



Get Trained


Recap from Quality Trainer
• If you complete all of the Quality Trainer, Minitab will send you a certificate. It takes 20-40 hours to complete all of the QT.



Get Trained Cont.


Tools/Resources: Software



Tools/Resources: Minitab16 Validation


Tools/Resources: Work Aids



Tools/Resources: MHOS
Medtronic Handbook of Statistics, Rev G: in PDF format only


Tools/Resources: Business SOPs
Business Unit procedures for:
• Test Method Validation (MSA included)
• Normality Testing
• Lot Acceptance (or sampling plans for incoming)
• SPC



Tools/Resources: Other Software

Miscellaneous


Tools/Resources: JMP Software
• Contact Kevin Gaffney at MECC if you are interested in obtaining JMP.
• The software has been officially validated and may be used within the quality system.
• JMP tends to be more interactive than Minitab and is more powerful for certain applications (e.g. advanced DOE).
• JMP is point-and-click like Minitab, but it is more “object-oriented” instead of “menu-oriented.”



Get Connected


Get Connected
Industrial Statistics Questions?
1) Contact your division’s Industrial Statistics Council member
2) Otherwise, contact Medtronic Statistical Resources



External Web Sites
Statistical Standards:

ISO Standards for Statistical Methods: can be purchased as part of a CD-ROM collection available at
http://www.iso.org/iso/pressrelease.htm?refid=Ref1134

ASTM Standards on Precision and Bias, 6th edition
http://www.astm.org/BOOKSTORE/COMPS/BIAS08.htm

ASTM SPC Standard
http://www.astm.org/Standards/E2587.htm


External Web Sites
Statistical Standards:

ASQ has ANSI/ASQ standards:
http://asq.org/quality-press/display-item/index.html?item=T004

GHTF has standards (link to GHTF process validation below):
http://www.ghtf.org/sg3/sg3-final.html

AIAG has guidance for MSA & SPC:
Publications Catalog - Automotive Industry Action Group

Large list of Acceptance Sampling Standards:
http://variation.com/techlib/standard.html
Lists acceptance sampling statistical standards (MIL-STD, ANSI, and ISO). Bulk sampling and reliability are also listed.



External Web Sites
Statistical Committees:

ISO Statistics Technical Committee: TC 69 - with six subcommittees
http://www.iso.org/iso/home/standards_development/list_of_iso_technical_committees/iso_technical_committee.htm?commid=49742

The Six ISO TC69 Subcommittees

ASTM Technical Committee E11 (Statistics)
http://www.astm.org/COMMIT/COMMITTEE/E11.htm

USP Expert Statistics Committee
http://www.usp.org/council-experts-expert-committees-overview/expertcommittees/statistics


External Web Sites
Handbooks:

NIST E-Statistics Handbook (has hyperlinks)
http://www.itl.nist.gov/div898/handbook/


