To detect non-randomized treatment, the propensity model is used to discriminate treated from control individuals. If the classifier's accuracy is strictly higher than the maximal accuracy expected under randomization (the population share of the larger of the treated and control groups, {{ binomialTest.accuracyReference | nicePrecision : 2 }}), treated and control data can be distinguished, which implies that the treatment is not randomized.
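As an illustration of the reference accuracy described above, here is a minimal sketch. The function name `reference_accuracy` and the group counts are hypothetical and are not taken from the product's code. It assumes the reference is the accuracy reached by always predicting the larger group.

```python
def reference_accuracy(n_treated: int, n_control: int) -> float:
    """Share of the larger group: the accuracy reached by always predicting it.

    Hypothetical helper for illustration only.
    """
    total = n_treated + n_control
    if total == 0:
        raise ValueError("at least one individual is required")
    return max(n_treated, n_control) / total


# Example with made-up counts: 300 treated and 700 control individuals.
print(reference_accuracy(300, 700))  # prints 0.7
```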
| Hypothesis tested | Treatment randomized (accuracy <= {{ binomialTest.accuracyReference | nicePrecision: 2}}) |
|---|---|
| Significance level | {{ 1 - binomialTest.confidenceLevel | nicePrecision: 2 }} |
| p-value | {{ binomialTest.pValue | nicePrecision:5 }} |
| Conclusion | Treatment non-randomization detected (p-value below the significance level) or Inconclusive (otherwise) |
The hypothesis tested is that the treatment is randomized, in which case the expected maximal propensity model accuracy is {{ binomialTest.accuracyReference | nicePrecision : 2}} (treated and control populations indistinguishable). The observed accuracy might deviate from this expectation, and the Binomial test evaluates whether this deviation is statistically significant by modelling the number of correct predictions as a random variable drawn from a Binomial distribution.
The p-value is the probability of observing an accuracy at least as high as the one measured, under the hypothesis that the treatment is randomized. If this probability is lower than the significance level (e.g. 5%), the hypothesis of treatment randomization is rejected: it is unlikely that the treatment is randomized, so it is considered non-randomized. Otherwise, the test is inconclusive.
The significance level is the accepted rate of false detections, i.e. cases where the test rejects the hypothesis even though the treatment is actually randomized.
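The test described above can be sketched as a one-sided exact Binomial test. This is a minimal illustration using only the Python standard library, not the product's implementation. The function names, the counts, and the default significance level of 0.05 are assumptions made for the example. Under the hypothesis of randomized treatment, the number of correct predictions is modelled as Binomial with `n_total` trials and success probability `p_ref` (the reference accuracy).

```python
from math import exp, lgamma, log


def binomial_upper_tail(n_correct: int, n_total: int, p_ref: float) -> float:
    """P(X >= n_correct) for X ~ Binomial(n_total, p_ref), summed in log space."""
    if not 0.0 < p_ref < 1.0:
        raise ValueError("p_ref must lie strictly between 0 and 1")
    log_p, log_q = log(p_ref), log(1.0 - p_ref)
    tail = 0.0
    for k in range(max(n_correct, 0), n_total + 1):
        # log of the binomial coefficient C(n_total, k)
        log_coeff = lgamma(n_total + 1) - lgamma(k + 1) - lgamma(n_total - k + 1)
        tail += exp(log_coeff + k * log_p + (n_total - k) * log_q)
    return min(tail, 1.0)  # guard against rounding slightly above 1


def randomization_test(n_correct, n_total, p_ref, significance_level=0.05):
    """Return (p_value, conclusion) for the hypothesis 'treatment randomized'."""
    p_value = binomial_upper_tail(n_correct, n_total, p_ref)
    if p_value < significance_level:
        conclusion = "Treatment non-randomization detected"
    else:
        conclusion = "Inconclusive"
    return p_value, conclusion


# Made-up example: 760 correct predictions out of 1000, reference accuracy 0.7.
p_value, conclusion = randomization_test(760, 1000, 0.7)
print(p_value, conclusion)
```

The tail is summed in log space so that large sample sizes do not overflow. A statistics library's exact Binomial test with a "greater" alternative should give the same p-value.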