What is Distribution Analyzer?

Distribution Analyzer is used to test whether a set of data fits the normal distribution and, if not, to determine which distribution best fits the data. Associated with each distribution is a transformation that, when applied to data from that distribution, converts it to the normal distribution. Once the data is transformed to the normal distribution, Distribution Analyzer constructs confidence statements like the following:

With 95% confidence more than 99% of the values are between 17.8 and 23.2 pounds (normal tolerance interval)

With 95% confidence more than 99.34% of the values are within the specification limits (variables sampling plan)
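
As an illustration of how a statement like the first one can be produced, the following Python sketch computes a two-sided normal tolerance interval. It is not Distribution Analyzer's own implementation: it assumes Howe's approximation for the tolerance factor and the Wilson-Hilferty approximation for the chi-square quantile (so that only the standard library is needed), and the function names and sample weights are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def chi2_quantile_wh(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * math.sqrt(a)) ** 3


def tolerance_factor(n, coverage=0.99, confidence=0.95):
    """Two-sided normal tolerance factor k (Howe's approximation)."""
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    # Lower-tail chi-square quantile with n - 1 degrees of freedom.
    chi2 = chi2_quantile_wh(1.0 - confidence, n - 1)
    return z * math.sqrt((n - 1) * (1.0 + 1.0 / n) / chi2)


def tolerance_interval(data, coverage=0.99, confidence=0.95):
    """Return (lower, upper) = average -/+ k * standard deviation."""
    k = tolerance_factor(len(data), coverage, confidence)
    avg, sd = mean(data), stdev(data)
    return avg - k * sd, avg + k * sd


# Hypothetical weights in pounds, assumed to have passed a normality test.
weights = [20.1, 20.7, 19.8, 21.2, 20.4, 20.9, 19.6, 20.3, 21.0, 20.5]
lower, upper = tolerance_interval(weights)
print(f"With 95% confidence more than 99% of the values are "
      f"between {lower:.1f} and {upper:.1f} pounds")
```

The interval is only meaningful when the data fits the normal distribution, which is why the normality test comes first.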

Distribution Analyzer is specifically designed to aid in the execution of validation/verification/qualification studies designed to make a claim about the performance of a product or process. However, it can be used any time one wants to test and fit distributions. It is also a valuable tool for learning about distributions, their relationships and their properties.

USES:

1. Validation/Verification/Qualification Studies: Such studies commonly use normal tolerance intervals and variables sampling plans to make confidence statements like the ones above. Both procedures assume that the data fits the normal distribution, and both are particularly sensitive to departures from normality, so a formal test for normality should be applied to the data before using either. Distribution Analyzer contains a specific test designed to detect the departures from normality that invalidate the use of these two procedures. This test allows one to proceed in many cases where traditional normality tests would fail. If the data is not normal, a transformation may be available that can be applied to the data to make it normal. Distribution Analyzer not only tests for normality and determines the best transformation, it also constructs the normal tolerance interval and variables sampling plan confidence statements. Further, it performs additional analysis, such as checking for outliers, looking for time order effects and looking for differences between groups, to aid in determining why a set of data fails normality.

2. Test and Fit Distributions: Distribution Analyzer has the ability to fit a wide range of distributions using both the method of moments and maximum likelihood methods. Both methods have been modified to ensure the distribution covers a specified range. Distribution Analyzer has simplified the whole process of comparing and fitting distributions by characterizing all distributions in terms of their moments (average, standard deviation, skewness, kurtosis) rather than using a different set of Greek letters for each distribution.
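
To make the idea of describing a distribution by its moments concrete, here is a minimal Python sketch. It computes the four sample moments and fits a gamma distribution by the plain, unmodified method of moments. It is not Distribution Analyzer's modified fitting procedure; the function names are hypothetical, and kurtosis is assumed to be reported on the scale where the normal distribution equals 3.

```python
import math


def sample_moments(data):
    """Return (average, standard deviation, skewness, kurtosis)."""
    n = len(data)
    avg = sum(data) / n
    m2 = sum((x - avg) ** 2 for x in data) / n
    m3 = sum((x - avg) ** 3 for x in data) / n
    m4 = sum((x - avg) ** 4 for x in data) / n
    sd = math.sqrt(m2 * n / (n - 1))   # sample standard deviation
    skew = m3 / m2 ** 1.5              # 0 for a symmetric sample
    kurt = m4 / m2 ** 2                # assumed scale: normal = 3
    return avg, sd, skew, kurt


def gamma_from_moments(avg, sd):
    """Plain method of moments: match the average and standard deviation."""
    shape = (avg / sd) ** 2
    scale = sd ** 2 / avg
    return shape, scale


def gamma_moments(shape, scale):
    """Moments implied by a gamma distribution with the given parameters."""
    return (shape * scale,
            math.sqrt(shape) * scale,
            2.0 / math.sqrt(shape),
            3.0 + 6.0 / shape)


# Hypothetical right-skewed measurements.
data = [1.2, 0.8, 2.5, 1.9, 0.6, 3.8, 1.1, 1.6, 2.2, 0.9]
avg, sd, skew, kurt = sample_moments(data)
shape, scale = gamma_from_moments(avg, sd)
print("sample moments:", avg, sd, skew, kurt)
print("fitted gamma moments:", gamma_moments(shape, scale))
```

Comparing the skewness and kurtosis implied by the fitted gamma with the sample values gives a quick sense of how well that distribution's shape matches the data.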

3. Learning Tool: Distribution Analyzer can be used to learn about the many distributions and their relationships. Included is a skewness-kurtosis plot for understanding the range of shapes each distribution can fit and the relationships between the different distributions. For each distribution you can view density plots, calculate probabilities and explore the effect of changing the parameters. Finally, you can generate random values for any of the distributions as well as for dice experiments to experience handling different types of data.

CAPABILITIES:

Distribution Analyzer has numerous capabilities, many not found elsewhere:

1. Robust and Specific Tests for Normality: While Distribution Analyzer contains the traditional Anderson-Darling and Shapiro-Wilk tests, it also contains two equally powerful tests designed to overcome shortcomings of these two tests. The first is the Skewness-Kurtosis All test which, like the previous two tests, tests for all departures from normality. However, unlike the other two tests, it is not adversely affected by ties in the data. The second is the Skewness-Kurtosis Specific test. This test is designed to detect those departures from normality that invalidate the use of a normal tolerance interval or variables sampling plan. Specifically, it is designed to detect tails that are heavier than those of the normal distribution. It does not reject when the tails are as heavy as or lighter than those of the normal distribution, in which case the confidence statements remain valid, although potentially conservative.
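
The Skewness-Kurtosis tests themselves are specific to Distribution Analyzer and are not reproduced here. To illustrate the general idea of testing normality through skewness and kurtosis, the Python sketch below uses the well-known Jarque-Bera statistic instead. This is a different and simpler test whose chi-square approximation is only reliable for large samples, and the function name is hypothetical.

```python
import math


def jarque_bera(data):
    """Return (statistic, approximate p-value, skewness, kurtosis).

    This is the Jarque-Bera test, NOT Distribution Analyzer's
    Skewness-Kurtosis All or Specific test.
    """
    n = len(data)
    avg = sum(data) / n
    m2 = sum((x - avg) ** 2 for x in data) / n
    m3 = sum((x - avg) ** 3 for x in data) / n
    m4 = sum((x - avg) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                      # normal distribution = 3
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value, skew, kurt


# Hypothetical measurements.
data = [9.8, 10.1, 10.0, 9.7, 10.4, 10.2, 9.9, 10.3, 10.0, 9.6]
stat, p_value, skew, kurt = jarque_bera(data)
print(f"JB = {stat:.3f}, p = {p_value:.3f}, "
      f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

Because this statistic depends on the data only through its moments, it has no ordering or spacing assumptions that ties would violate. A kurtosis above 3 points toward tails heavier than the normal distribution, the kind of departure the Skewness-Kurtosis Specific test is described as targeting.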

2. Improved Methods of Fitting Distributions: In determining which distribution best fits your data, Distribution Analyzer uses both the method of moments and maximum likelihood methods. Both methods have been modified to ensure the distribution covers a specified range. This avoids the problem of spec limits falling outside the bounds of the fitted distribution, which would make transformation of the spec limits impossible. A wide range of distributions is covered:

Beta Distribution

Exponential Distribution

Extreme Value Distributions (Smallest Extreme Value, Largest Extreme Value, Weibull, Fréchet)

Gamma Distribution

Johnson Family of Distributions

Logistic and Loglogistic Distributions

Lognormal Distribution

Normal Distribution

Pearson Family of Distributions (Includes Inverse Beta and Inverse Gamma)

Uniform Distribution

Not only are the above distributions covered, but the negatives of these distributions are also included to facilitate the fitting of data with negative skewness. Further, if there are physical bounds (like zero), these bounds can be used to pre-transform the data to the unbounded case. This effectively expands the above list of distributions to include the Log-Beta, Log-Pearson and many more. Distribution Analyzer automates the whole process by letting you click a button, "Fit Best Distribution". However, you also have the ability to completely control the selection process, including the distribution fit, the method of fitting and whether to pre-transform.
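
The two ideas in the preceding paragraph, pre-transforming data with a physical bound and negating negatively skewed data, can be sketched in Python as follows. This assumes a logarithmic pre-transform for a lower bound, which is one common choice and not necessarily the one Distribution Analyzer uses. The function names, data and spec limits are hypothetical.

```python
import math
from statistics import mean, stdev


def to_unbounded(x, lower=0.0):
    """Map a value bounded below by `lower` onto the unbounded scale."""
    if x <= lower:
        raise ValueError("value must lie strictly above the lower bound")
    return math.log(x - lower)


def from_unbounded(y, lower=0.0):
    """Inverse of to_unbounded."""
    return lower + math.exp(y)


def negate_with_limits(data, lsl, usl):
    """Negate the data; the spec limits are negated and swap roles."""
    return [-x for x in data], -usl, -lsl


# Hypothetical fill volumes with a physical lower bound of zero.
fill_volumes = [4.1, 5.3, 3.8, 6.9, 4.6, 5.0, 8.2, 4.4]
lsl, usl = 2.0, 12.0

# The spec limits must be transformed the same way as the data.
logged = [to_unbounded(x) for x in fill_volumes]
log_lsl, log_usl = to_unbounded(lsl), to_unbounded(usl)
print("log scale:", mean(logged), stdev(logged), log_lsl, log_usl)

# For negatively skewed data, fit the negated values instead.
negated, neg_lsl, neg_usl = negate_with_limits(fill_volumes, lsl, usl)
print("negated limits:", neg_lsl, neg_usl)
```

A spec limit at or beyond the physical bound cannot be pre-transformed this way, which is the same kind of issue the range-covering modification to the fitting methods is meant to avoid.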

3. Integrated Supporting Analysis: Data can fail a normality test for reasons other than the shape of the underlying distribution, including a shift over time, a mixture of different groups and the presence of outliers. Whenever a distribution's fit to the data is tested, the data is automatically checked for outliers, shifts over time and differences between groups, and the user is notified if anything of importance is detected.

4. Detailed Information About Each Distribution: For each distribution you can view density plots, calculate probabilities and explore the effect of changing the parameters. You can view the distribution on a skewness-kurtosis plot for understanding the range of shapes each distribution can fit and its relationships with other distributions.

5. Generating Data: Random numbers from any of the distributions can be generated. You can also use Distribution Analyzer to perform simulated dice experiments to illustrate the types of physical phenomena that create different distributions.
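
A dice experiment of this kind can be imitated in a few lines of Python. The sketch below is not Distribution Analyzer's simulator; the function names, the number of dice and the seed are arbitrary choices. It contrasts two hypothetical mechanisms: adding dice, which gives a roughly symmetric, bell-shaped result, and multiplying dice, which gives a strongly right-skewed one.

```python
import random


def roll(rng, dice):
    """Roll `dice` fair six-sided dice."""
    return [rng.randint(1, 6) for _ in range(dice)]


def dice_sums(trials, dice, seed=0):
    """Additive effects: the total of the dice in each trial."""
    rng = random.Random(seed)
    return [sum(roll(rng, dice)) for _ in range(trials)]


def dice_products(trials, dice, seed=0):
    """Multiplicative effects: the product of the dice in each trial."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        product = 1
        for face in roll(rng, dice):
            product *= face
        results.append(product)
    return results


def skewness(values):
    """Sample skewness (third standardized moment)."""
    n = len(values)
    avg = sum(values) / n
    m2 = sum((v - avg) ** 2 for v in values) / n
    m3 = sum((v - avg) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


sums = dice_sums(5000, 5)
products = dice_products(5000, 5)
print("skewness of sums:    ", skewness(sums))
print("skewness of products:", skewness(products))
```

Taking the logarithm of the products turns them back into sums, which is one way to see why a log transformation often makes right-skewed data look normal.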