STAT-18: Statistical Techniques for Normality Testing and Transformations

This is part of a series of articles covering the procedures in the book Statistical Procedures for the Medical Device Industry.

Purpose

To provide guidance on normality testing to ensure the assumption of normality is adequately met when using variables sampling plans and related procedures. Related procedures include normal tolerance intervals, variables confidence limits for the proportion, and confidence statements for Ppk.  Included are procedures to handle situations when the normality test fails and procedures for the detection and handling of outliers.

Appendices

  A. General Normality Tests
  B. Effects of Ties on General Normality Tests
  C. Skewness–Kurtosis Specific Normality Test
  D. Transforming Data
  E. Sublotting Data using ANOVA
  F. Sublotting Data using Kruskal–Wallis Test
  G. Investigating Outliers
  H. Invalidating an Outlier Value by Repeated Retesting
  I. High Capability Acceptance Criteria

Highlights

Banding Pattern Due to Ties:  If a normal probability plot looks like the one below, the bands are caused by the same value being repeated multiple times (ties).  When this happens, both the Anderson-Darling and Shapiro-Wilk tests can falsely reject normal data (here, p-value = 0.016).  The histogram of the data certainly looks normal.  Appendix B describes this problem.  The solution is to use the SK All (D’Agostino–Pearson) test, which is robust to ties in the data.  The SK All test is described in Appendix A and is available in the validated spreadsheet STAT-18 – Skewness-Kurtosis Normality Tests accompanying the book.

Banding Pattern
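
To make the ties discussion concrete, here is a minimal Python sketch of the textbook D'Agostino-Pearson K² omnibus statistic, which combines a skewness z-score and a kurtosis z-score. It is an illustration only: it is not the validated STAT-18 spreadsheet, and the book's SK All implementation may differ in detail. The function names, the seed, the simulated data, and the rounding resolution are all assumptions made for the example. Because the statistic depends only on sample moments, rounding the data (which creates ties) perturbs it far less than it perturbs tests based on the ordered values.

```python
import math
import random


def central_moment(data, k):
    """k-th central moment with divisor n, as used for sample skewness and kurtosis."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def skewness_z(data):
    """D'Agostino (1970) normalizing transform of sample skewness (intended for n >= 8)."""
    n = len(data)
    g1 = central_moment(data, 3) / central_moment(data, 2) ** 1.5
    y = g1 * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    beta2 = (3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    return delta * math.asinh(y / alpha)


def kurtosis_z(data):
    """Anscombe-Glynn (1983) normalizing transform of sample kurtosis (intended for n >= 20)."""
    n = len(data)
    b2 = central_moment(data, 4) / central_moment(data, 2) ** 2
    mean_b2 = 3.0 * (n - 1) / (n + 1)
    var_b2 = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    x = (b2 - mean_b2) / math.sqrt(var_b2)
    sqrt_beta1 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6.0 + 8.0 / sqrt_beta1 * (2.0 / sqrt_beta1
                                  + math.sqrt(1.0 + 4.0 / sqrt_beta1 ** 2))
    denom = 1.0 + x * math.sqrt(2.0 / (a - 4.0))
    # Signed cube root, so a negative denominator is handled correctly.
    cube_root = math.copysign(abs((1.0 - 2.0 / a) / denom) ** (1.0 / 3.0), denom)
    return (1.0 - 2.0 / (9.0 * a) - cube_root) / math.sqrt(2.0 / (9.0 * a))


def dagostino_pearson(data):
    """Return (K2, p-value). Under normality K2 is approximately chi-square with 2 df."""
    k2 = skewness_z(data) ** 2 + kurtosis_z(data) ** 2
    return k2, math.exp(-k2 / 2.0)  # survival function of chi-square with 2 df


# Hypothetical data: normal values, then the same values rounded to a coarse
# measurement resolution so that many observations are tied.
rng = random.Random(18)  # arbitrary seed, for repeatability only
raw = [rng.gauss(10.0, 1.0) for _ in range(100)]
tied = [round(x * 2) / 2 for x in raw]  # resolution of 0.5 units

for label, sample in (("raw", raw), ("tied", tied)):
    k2, p = dagostino_pearson(sample)
    print(f"{label:5s} distinct values={len(set(sample)):3d}  K2={k2:.3f}  p={p:.3f}")
```

Comparing the two printed lines shows how much the moment-based statistic moves when the number of distinct values collapses. Any real study should rely on the validated spreadsheet rather than a sketch like this one.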
  • Bounded by the Normal Distribution:  Not all departures from normality invalidate the use of a variables sampling plan.  The data below fails the general normality tests due to short tails (Anderson-Darling p-value 0.0001).  The data has good capability and seems to be bounded by the normal distribution.  The only thing keeping it from passing is the failed normality test.  The SK Specific test has been designed for this purpose.  It asks the question “Can a variables sampling plan be used?” rather than “Is the data normal?”  It accepts certain departures from normality that do not invalidate the confidence statement associated with the variables sampling plan.  The SK Specific test passes this data, allowing the study to pass.  The SK Specific test is described in Appendix C and is available in the validated spreadsheet STAT-18 – Skewness-Kurtosis Normality Tests accompanying the book.

Short Tails

  • High Capability Data:  When data has high capability, a normality test may not be required.  Suppose the desired confidence statement is 95%/99% and the plan n=50, Ppk=0.96 was selected.  The data below fails the normality test.  However, since the estimated Ppk is more than 1.84 times the acceptance criterion of 0.96 (that is, above about 1.77) and the skewness is greater than -2, the high capability acceptance criteria are met and no normality test is required.  The high capability acceptance criteria are described in Appendix I.

High Capability

  • Flowchart:  The procedure provides numerous options for handling nonnormal data, including transforming (Appendix D), sublotting (Appendices E, F), and invalidating outliers (Appendices G, H).  It provides a step-by-step flowchart for deciding which approach to use and when.  The flowchart and instructions are important for preventing abuse of these methods, that is, “analysis until it passes”.
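
The arithmetic behind the High Capability Data highlight above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the procedure in Appendix I: the multiplier 1.84 and the skewness bound of -2 are simply the values quoted for the n=50, Ppk=0.96 example and may differ for other plans. The function names, the simple moment-based skewness estimator, the specification limits, and the ten data values are all hypothetical. Ppk is computed in the standard way, as the distance from the mean to the nearest specification limit divided by three sample standard deviations.

```python
import math


def ppk(data, lsl=None, usl=None):
    """Ppk: distance from the mean to the nearest spec limit, in units of 3 sample SDs."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    sides = []
    if lsl is not None:
        sides.append((mean - lsl) / (3.0 * sd))
    if usl is not None:
        sides.append((usl - mean) / (3.0 * sd))
    return min(sides)  # at least one limit must be supplied


def skewness(data):
    """Simple moment-based sample skewness (an assumed estimator for this sketch)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


def meets_high_capability(data, lsl=None, usl=None,
                          criterion=0.96, multiplier=1.84, min_skewness=-2.0):
    """Check the two conditions quoted in the example; the defaults are the example's values."""
    threshold = multiplier * criterion  # 1.84 * 0.96 = 1.7664
    return ppk(data, lsl, usl) > threshold and skewness(data) > min_skewness


# Hypothetical measurements and specification limits, for illustration only.
sample = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
print(f"Ppk = {ppk(sample, lsl=9.0, usl=11.0):.3f}, skewness = {skewness(sample):.3f}")
print("High capability criteria met:", meets_high_capability(sample, lsl=9.0, usl=11.0))
```

Tightening the hypothetical limits to 9.5 and 10.5 drops the estimated Ppk below the 1.7664 threshold, in which case the shortcut would not apply and a normality test would still be needed.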
