Propensity distribution

Propensity positivity

The positivity assumption states that every subset of the feature space has a nonzero probability of receiving both treatment and control.

This chart can help detect violations of this assumption. One way of finding such violations is to check whether any bin of the probability distribution is populated (in a statistically significant way) exclusively with treated or exclusively with control individuals.
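Such a bin-level check can be sketched as follows. This is an illustrative helper, not the product's implementation; `propensity` (predicted treatment probabilities) and `treated` (0/1 indicator) are assumed input names.

```python
import numpy as np

def check_positivity(propensity, treated, n_bins=10, min_count=1):
    """Flag propensity bins populated exclusively by treated or control units.

    Illustrative sketch: `propensity` holds predicted treatment
    probabilities, `treated` a 0/1 treatment indicator.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to a bin; clip so p == 1.0 lands in the last bin.
    bins = np.clip(np.digitize(propensity, edges) - 1, 0, n_bins - 1)
    violations = []
    for b in range(n_bins):
        mask = bins == b
        if mask.sum() < min_count:
            continue  # skip empty or near-empty bins
        n_treated = int(treated[mask].sum())
        n_control = int(mask.sum()) - n_treated
        if n_treated == 0 or n_control == 0:
            violations.append((edges[b], edges[b + 1]))
    return violations
```

A one-sided bin, e.g. `[0.0, 0.1)` containing only control units, would be reported as a candidate positivity violation.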

Propensity probability calibration

Calibration of predicted treatment probabilities

Calibration denotes the consistency between predicted probabilities and the actual frequencies observed on a test dataset.

Calibration data points (displayed as single dots) are built by computing:

  • the average prediction (x-axis)
  • the observed frequency of the predicted class (y-axis)
for predictions within a range of probabilities, e.g. [0, 0.1), [0.1, 0.2), etc., up to [0.9, 1].
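The construction above can be sketched directly. This mirrors the chart's description rather than a specific library call; the function and argument names are illustrative.

```python
import numpy as np

def calibration_points(y_prob, y_true, n_bins=10):
    """Compute (mean predicted probability, observed frequency, count) per bin.

    Bins are [0, 0.1), [0.1, 0.2), ..., with the final bin closed at 1.
    Illustrative sketch of the chart's construction.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bins = np.clip(np.digitize(y_prob, edges) - 1, 0, n_bins - 1)
    points = []
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue  # no dot is drawn for an empty bin
        mean_pred = y_prob[mask].mean()  # x-axis
        freq = y_true[mask].mean()       # y-axis
        points.append((mean_pred, freq, int(mask.sum())))
    return points
```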

A perfectly calibrated model should have calibration data points that are on the diagonal line.

In reality, the calibration data points often do not lie exactly on the diagonal line; the average distance between them and the diagonal measures the quality of the calibration, called the calibration loss.

The calibration loss is computed as the absolute difference between the calibration data points and the diagonal, averaged over the test set, weighted by the number of elements used to compute each point.
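That weighted average can be written compactly. This is a sketch matching the description above, not the product's exact implementation; `points` is assumed to hold the `(mean_pred, freq, count)` tuples plotted on the chart.

```python
import numpy as np

def calibration_loss(points):
    """Weighted mean absolute gap between calibration dots and the diagonal.

    `points` holds (mean_pred, freq, count) tuples; illustrative sketch.
    """
    preds = np.array([p[0] for p in points])
    freqs = np.array([p[1] for p in points])
    counts = np.array([p[2] for p in points], dtype=float)
    # |observed frequency - mean prediction| per dot, weighted by bin size.
    return float(np.average(np.abs(freqs - preds), weights=counts))
```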

A calibration curve (displayed as a solid line) is computed as a smoothed version of the calibration data points, taking into account the number of observations behind each calibration data point, with the diagonal line as a prior.
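One simple way to realize "diagonal as a prior" is count-weighted shrinkage toward the diagonal. The actual smoothing used by the chart may differ (e.g. a spline fit); `prior_strength` is a hypothetical pseudo-count introduced here for illustration.

```python
def smoothed_curve(points, prior_strength=10.0):
    """Shrink each calibration dot toward the diagonal, more strongly when
    the dot is supported by few observations.

    Illustrative sketch only; `prior_strength` is a hypothetical
    pseudo-count, not a documented parameter of the chart.
    """
    curve = []
    for mean_pred, freq, count in points:
        w = count / (count + prior_strength)
        # Weighted blend of the observed frequency and the diagonal (y = x):
        # a dot with many observations keeps its frequency, a sparse dot is
        # pulled toward the diagonal prior.
        curve.append((mean_pred, w * freq + (1.0 - w) * mean_pred))
    return curve
```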

The calibration loss is .

Calibration curve data unavailable for this model. Try retraining the model.
No propensity model was trained. You can enable treatment analysis from the Design.
A propensity model predicts the probability of being treated for each individual, enabling positivity analysis:
  • The propensity distribution histogram can help detect violations of the positivity assumption
  • The propensity calibration chart assesses the consistency between predicted treatment probabilities and the actual frequencies observed on a test dataset