# Statistical Evaluation

Statistical inferences can be made when comparing sensitivity testing results. It is often necessary to compare two sets of sensitivity test results and draw conclusions from the comparison. Examples include using sensitivity testing to determine sample differences or to determine machine, operator, or site repeatability. Two methods for comparing sensitivity testing results are discussed here: the Chart Significance Method and PROBIT.

Comparison between the characteristic responses (initiation probability versus impetus energy or the number of reactions in a given number of trials) of two materials is valuable for use in decision making. One material may have a different slope relating energy to reaction probability or a different number of reactions for a given number of trials, but that difference may or may not be statistically significant. If the difference is not statistically significant, drawing conclusions to the contrary may lead to negative consequences. For example, a new formulation may appear to be more sensitive than the previous formulation, yet the statistics may be inconclusive as to whether the two actually differ. Modifying the formulation could be a costly and incorrect response to such a result.

Before discussing the two methods, it’s important to understand the variability inherent in binomial trials. Binomial trials are those where the outcome is either a success (no reaction) or a failure (reaction). An example is flipping a coin (in a perfectly random way). The probability of flipping a head with an unbiased coin is 50%, yet if a coin is flipped randomly ten times, the result could be anywhere from 0 heads to 10 heads, although it’s likely (better than 95% probable) that between 2 and 8 heads will be observed. Observing only 2 heads in 10 trials therefore does not show that the coin is biased. It would likely take considerably more than 10 trials to identify a biased coin.
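The coin example can be checked directly from the binomial distribution. The short sketch below (the function name is illustrative, not from any referenced tool) sums the binomial probabilities for 2 through 8 heads in ten fair flips; the result is close to 98%, consistent with the statement that such an outcome is at least 95% likely.

```python
from math import comb


def binomial_interval_probability(n, p, low, high):
    """Probability of observing between low and high successes (inclusive)
    in n independent trials that each succeed with probability p."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(low, high + 1)
    )


# Ten flips of an unbiased coin: chance of seeing 2 to 8 heads.
prob = binomial_interval_probability(10, 0.5, 2, 8)
print(f"P(2 <= heads <= 8) = {prob:.4f}")  # 0.9785
```

The complementary outcomes (0, 1, 9, or 10 heads) together account for only 22 of the 1024 equally likely sequences.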

In the discussion presented here, any variability in operating the testing equipment (such as variability due to machine inconsistencies or operator or environmental conditions) is not included. Although this undoubtedly could be a significant factor, the effect on the variability of the sensitivity testing is not addressed here other than stating that if a significant difference is observed between samples, the difference may be due to operator or equipment inconsistencies. This discussion treats the best case where the operational variability (differences between operators, reaction determination, and machine operation) is limited or insignificant.

## Chart Significance Method (Developed by SMS)

Perhaps the quickest way to evaluate two energetic samples to determine similarities or differences in sensitivity is to test at a given energy. At a given energy, Sample A may yield 3 of 20 reactions whereas Sample B gives 9 of 20 reactions. From these results, are the two samples different? Often it’s concluded that they are different. The Chart Significance Method (developed by Safety Management Services, Inc.) gives a statistically based answer to that question. The method applies to constant energy binomial testing.

As noted in the introductory section, there is a distribution of outcomes resulting from binomial testing. We have observed persons in industry draw conclusions without properly accounting for that inherent distribution. The Chart Significance Method makes it easy to account for it. Table 1 below is used to compare the results from testing of two 20 trial samples at a given energy. Likewise, Table 2 is for two 10 trial samples.

**Table 1.** Matrix of significance for 2 sets of 20 binomial trials. Unique p-values given are an average of two Monte-Carlo calculations of 10,000 random points each of the given distribution, modeled as a beta function. Areas in green indicate results are significant (at 95% confidence) whereas areas in grey indicate that results are inconclusive. See text for a more in-depth discussion including examples. The darker hued diagonal and center square indicate regions about which the table is symmetric: the table is bisymmetric (symmetric and centrosymmetric). Note that for the cases of zero reactions and 20 reactions in 20 trials, the initiation probabilities from which the table was generated are estimated; i.e., it may be possible that at those levels the p-values are lower than represented and reported here.

Each cell in Table 1 represents a hypothesis test between two sets of 20 trials. The color corresponds to the level of significance (by p-value) in rejecting the hypothesis that the probabilities of initiation for the two tests of 20 trials are equivalent (with the alternative hypothesis being that they are not equivalent). For example, PETN exhibits 3 reactions in 20 trials at a 16 cm drop height for impact sensitivity and Sample C exhibits 9 reactions in 20 trials at that same height. Using Table 1, it cannot be stated with statistical significance that Sample C is different from PETN. The result of this hypothesis test is given by the cell where row 3 (representing 3 reactions in 20 trials) intersects column 9 (representing 9 reactions in 20 trials); the cell is grey, indicating that it cannot be stated with 95% confidence that the initiation probabilities of the two materials as tested differ. However, if Sample C had resulted in 10 reactions in 20 trials, it could have been concluded with 95% confidence that Sample C is more sensitive than PETN. A detailed description of the methodology and steps of such a hypothesis test can be downloaded here under “Chart Significance Method and PROBIT Comparisons”.
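As an independent cross-check of the PETN example, the sketch below applies Fisher's exact test to the same counts. This is a different technique from the beta-function Monte-Carlo calculation used to build Table 1, so its p-values should not be expected to reproduce the table's entries; it is shown only because it is a standard way to test whether two binomial samples share the same underlying probability. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(k_a, n_a, k_b, n_b):
    """Two-sided Fisher's exact test p-value for k_a reactions in n_a trials
    versus k_b reactions in n_b trials (NOT the SMS Monte-Carlo method)."""
    total = k_a + k_b
    denom = comb(n_a + n_b, total)

    def prob(k):
        # Hypergeometric probability of k reactions landing in sample A.
        return comb(n_a, k) * comb(n_b, total - k) / denom

    lowest = max(0, total - n_b)
    highest = min(n_a, total)
    p_observed = prob(k_a)
    # Sum every outcome that is no more likely than the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


p_3_vs_9 = fisher_exact_two_sided(3, 20, 9, 20)
p_3_vs_10 = fisher_exact_two_sided(3, 20, 10, 20)
print(f"3/20 vs 9/20:  p = {p_3_vs_9:.3f}")   # about 0.08, inconclusive at 95%
print(f"3/20 vs 10/20: p = {p_3_vs_10:.3f}")  # about 0.04, significant at 95%
```

For this particular pair of examples the exact test lands on the same side of the 95% threshold as the chart: 3 versus 9 reactions is inconclusive, while 3 versus 10 is significant.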

**Table 2.** Matrix of significance for 2 sets of 10 binomial trials. Unique p-values given are an average of two Monte-Carlo calculations of 10,000 random points each of the given distribution, modeled as a beta function. Areas in green indicate results are significant (at 95% confidence) whereas areas in grey indicate that results are inconclusive. See text for a more in-depth discussion including examples. The darker hued diagonal and center square indicate regions about which the table is symmetric: the table is bisymmetric (symmetric and centrosymmetric). Note that for the cases of zero reactions and 10 reactions in 10 trials, the initiation probabilities from which the table was generated are estimated; i.e., it may be possible that at those levels the p-values are lower than represented and reported here.

The variability in the observed reaction probability (e.g., 4 of 10 reactions seen, followed by perhaps 7 of 10 reactions) is reduced as the number of trials increases. This analysis assumes that the probability of initiation is exactly the same for each and every trial; if variation is introduced by the operator, the machine, or the substance, the initiation probability is likely no longer constant across trials. The results in Tables 1 and 2 are the best case (i.e., the inconclusive band is as narrow as it can get), where the initiation probability is constant for each trial.
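The effect of trial count on variability can be illustrated with the standard deviation of the observed reaction fraction, which for a constant initiation probability p over n trials is sqrt(p(1 - p)/n). The sketch below uses an assumed p of 0.5 purely for illustration; the function name is hypothetical.

```python
import math


def observed_fraction_std(p, n):
    """Standard deviation of the observed reaction fraction for n binomial
    trials with a constant initiation probability p."""
    return math.sqrt(p * (1 - p) / n)


# Assumed initiation probability of 0.5, chosen only as an example.
for trials in (10, 20, 100):
    spread = observed_fraction_std(0.5, trials)
    print(f"{trials:>3} trials: std of observed fraction = {spread:.3f}")
```

Because the spread falls only with the square root of n, doubling the trial count from 10 to 20 narrows the scatter by roughly 30%, not by half.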

## PROBIT Comparison

A PROBIT plot relates the event probability to an energetic impetus. In energetic manufacturing and testing the impetus is usually impact, friction, or ESD. PROBIT plots are useful in estimating the event probability or initiation probability at impetus values that have not been specifically tested; they are also useful in comparing the sensitivity of two conditions or materials. Comparing the sensitivities of two materials through their PROBIT plots can be more accurate than a comparison at a single energy level. A method to compare PROBIT plots is reviewed here.

PROBIT plots present non-linear behavior in a linear way. For example, most initiation phenomena are normally distributed with transition areas (regions where the probability of initiation changes from near 0 to near 1) of varying width. A PROBIT plot represents this S-shaped curve as a straight line, thus showing the low probabilities that are close to zero with better resolution.
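The linearizing effect can be sketched with the probit transform, i.e. the inverse of the standard normal cumulative distribution function. The example below assumes a hypothetical response that is normally distributed in log10 of the drop height; the median height (25 cm) and spread (0.2 log10 units) are made-up values, not test data.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

# Hypothetical response parameters (assumptions for illustration only).
MEDIAN_LOG_HEIGHT = math.log10(25.0)
SPREAD_LOG_HEIGHT = 0.2


def probit(probability):
    """Inverse standard normal CDF; input must lie strictly between 0 and 1."""
    return STANDARD_NORMAL.inv_cdf(probability)


def initiation_probability(height_cm):
    """S-shaped initiation probability for the assumed response."""
    response = NormalDist(MEDIAN_LOG_HEIGHT, SPREAD_LOG_HEIGHT)
    return response.cdf(math.log10(height_cm))


heights_cm = [10, 16, 25, 40, 63]
points = [(math.log10(h), probit(initiation_probability(h))) for h in heights_cm]

# Slope between consecutive points; a straight line gives a constant slope.
slopes = [
    (y2 - y1) / (x2 - x1)
    for (x1, y1), (x2, y2) in zip(points, points[1:])
]
for (x, y), h in zip(points, heights_cm):
    print(f"height {h:>2} cm: log10 = {x:.3f}, probit = {y:+.3f}")
```

Every consecutive slope equals 1 / SPREAD_LOG_HEIGHT, so the S-shaped curve becomes a straight line whose slope reflects the width of the transition region.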

Comparison between the characteristic responses (initiation probability versus the impetus energy) is valuable in determining material or parametric differences. Here we describe a method to compare linearly represented initiation data. The method below is similar to the Hercules Parallel Line Assay program, which was used to compare different sets of PROBIT data and to combine them into a representative line for quantitative analysis.

There are simple methods, included in many software packages, for statistically comparing linear regression coefficients. The linear coefficients describing the initiation probability as a function of energy can easily be obtained when plotting the data on a PROBIT plot. The specific details used to perform a statistical comparison of the parameters of two linear relationships can be downloaded here under “Chart Significance Method and PROBIT Comparisons”; here we discuss example results.

Suppose that an impact test is completed on two substances that yield PROBIT plots. The sensitivity results for the two substances are plotted in a log-normal way (PROBIT line); experience at Hercules Inc. Aerospace Division (now part of Alliant Techsystems Inc.) indicates that a log-normal relationship best describes the relationship between the impetus energy and the probability of initiation.

A simple way to compare the two PROBIT relationships is to compare the slopes and intercepts of the regression lines. If the slopes or intercepts are statistically different then it’s likely the materials have different sensitivities; however, if the slopes and intercepts are not statistically different, it cannot be concluded that the substances have statistically different sensitivities (given, of course, the log-normal relationship is true).
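A slope comparison of this kind can be sketched as follows. This is a simplified illustration, not the procedure from the downloadable document: it fits unweighted least-squares lines to probit-transformed reaction fractions and forms the usual pooled-variance t statistic for the difference in slopes. All data values and function names are hypothetical.

```python
import math
from statistics import NormalDist


def probit(probability):
    """Inverse standard normal CDF; input must lie strictly between 0 and 1."""
    return NormalDist().inv_cdf(probability)


def fit_line(x, y):
    """Unweighted least-squares line through (x, y) with summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {"slope": slope, "intercept": intercept, "sxx": sxx, "sse": sse, "n": n}


def slope_difference_t(fit_a, fit_b):
    """t statistic and degrees of freedom for the difference of two slopes,
    using a pooled residual variance."""
    df = fit_a["n"] + fit_b["n"] - 4
    pooled_variance = (fit_a["sse"] + fit_b["sse"]) / df
    std_error = math.sqrt(pooled_variance * (1 / fit_a["sxx"] + 1 / fit_b["sxx"]))
    return (fit_a["slope"] - fit_b["slope"]) / std_error, df


# Hypothetical reaction fractions at five drop heights (made-up numbers).
heights_cm = [10, 16, 25, 40, 63]
fractions_a = [0.05, 0.15, 0.45, 0.75, 0.95]
fractions_b = [0.10, 0.30, 0.55, 0.85, 0.95]

log_heights = [math.log10(h) for h in heights_cm]
fit_a = fit_line(log_heights, [probit(p) for p in fractions_a])
fit_b = fit_line(log_heights, [probit(p) for p in fractions_b])
t_stat, df = slope_difference_t(fit_a, fit_b)

print(f"slope A = {fit_a['slope']:.2f}, slope B = {fit_b['slope']:.2f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The magnitude of t is then compared with the two-sided critical value of Student's t distribution for the given degrees of freedom (2.447 at 95% confidence for 6 degrees of freedom); a smaller magnitude means the slopes cannot be called statistically different. An analogous test applies to the intercepts, and a full PROBIT analysis would typically weight each point by its number of trials rather than treat all points equally.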