Calibration requires additional computations at scoring time and thus can increase latency and decrease throughput of deployed models.
Calibrating a classification model on the test set changes the log-loss metric, usually for the better (a lower log loss).
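Isotonic regression calibration performed on the test set might also alter other metrics, for example ranking-based metrics such as AUC, since it can map distinct scores to the same calibrated probability.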
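
The effect on metrics can be illustrated with a small, self-contained sketch. It is not the product's implementation: the six scores and labels are invented toy data, and `pav_fit`, `isotonic_calibrate_in_sample`, `log_loss` and `auc` are hypothetical helpers written for this example (a plain pool-adjacent-violators fit that assumes all scores are distinct). The sketch fits an isotonic calibrator on the same rows it is then evaluated on, and compares log loss and AUC before and after.

```python
import math

def pav_fit(labels_in_score_order):
    """Pool-adjacent-violators: best non-decreasing fit to labels sorted by score."""
    blocks = []  # each block is [sum_of_labels, count]
    for y in labels_in_score_order:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def isotonic_calibrate_in_sample(scores, y_true):
    """Fit and apply the isotonic map on the same rows (assumes distinct scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fitted = pav_fit([y_true[i] for i in order])
    calibrated = [0.0] * len(scores)
    for rank, i in enumerate(order):
        calibrated[i] = fitted[rank]
    return calibrated

def log_loss(y_true, proba, eps=1e-15):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, proba):
        p = min(max(p, eps), 1 - eps)
        total -= math.log(p) if y == 1 else math.log(1 - p)
    return total / len(y_true)

def auc(y_true, proba):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, proba) if y == 1]
    neg = [p for y, p in zip(y_true, proba) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy "test set" (invented for illustration only).
raw_scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
y_test = [0, 1, 0, 0, 1, 1]

calibrated = isotonic_calibrate_in_sample(raw_scores, y_test)

print("log loss:", log_loss(y_test, raw_scores), "->", log_loss(y_test, calibrated))
print("AUC:     ", auc(y_test, raw_scores), "->", auc(y_test, calibrated))
```

On this toy data the calibrated log loss is lower than the raw one, and the AUC moves from 7/9 to 8/9 because three rows with different raw scores end up sharing one calibrated probability. Both numbers are computed on the rows used to fit the calibrator, so they describe the fit rather than performance on unseen data.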
Moreover, SQL scoring with calibrated probabilities is not supported.