Concepts
When training
When you train a machine learning model, a portion of the data is held out in order to evaluate the model's performance (this held-out data is then called the test set).
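As an illustration of the general hold-out principle (outside of DSS), the following is a minimal scikit-learn sketch: the dataset and model choice are purely illustrative assumptions, not what DSS does internally.

```python
# Minimal sketch of hold-out evaluation (illustrative only, not DSS internals).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

# Load an example dataset (illustrative choice).
X, y = load_breast_cancer(return_X_y=True)

# Hold out a portion of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train on the remaining data.
model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)

# Evaluate on the held-out test set.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Test AUC: {auc:.3f}")
```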
The result of this evaluation operation is what can be seen in the Results screens of the models:
- Performance metrics
- Confusion matrix
- Decision charts
- Density charts
- Lift charts
- Error deciles
- Partial dependences
- Subpopulation analysis
- …
For more details on all possible result screens, see Prediction Results.
Each time you train a model in the visual analysis, and each time you retrain a saved model through a training recipe, a new version of the model and its associated evaluation is produced.
Subsequent evaluations
In addition to the evaluation that is automatically generated when training a model, it can be useful to evaluate a model on a different dataset, at a later time.
This is especially useful to detect Drift, i.e. when a model no longer performs as well, usually because the external conditions have changed.
In DSS, creating subsequent evaluations is done using an Evaluation recipe. These evaluations are stored in a Model Evaluation Store.
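Outside of DSS, the underlying idea of a subsequent evaluation can be sketched as re-scoring a previously trained model on a newer labeled dataset and comparing the metric to the one measured at training time. The file names, the recorded AUC value, and the drop threshold below are illustrative assumptions, not DSS APIs.

```python
# Sketch of a subsequent evaluation to spot performance drift
# (illustrative only; DSS does this with an Evaluation recipe
# and a Model Evaluation Store).
import joblib
import pandas as pd
from sklearn.metrics import roc_auc_score

# Hypothetical artifacts: a model trained earlier and a newer labeled dataset.
model = joblib.load("model_v1.joblib")
recent = pd.read_csv("recent_labeled_data.csv")

X_recent = recent.drop(columns=["target"])
y_recent = recent["target"]

# Evaluate the old model on the new data.
recent_auc = roc_auc_score(y_recent, model.predict_proba(X_recent)[:, 1])

# Compare against the AUC recorded on the original test set (illustrative value).
original_test_auc = 0.93
if original_test_auc - recent_auc > 0.05:
    print(f"Possible drift: AUC dropped from {original_test_auc:.2f} to {recent_auc:.2f}")
else:
    print(f"No significant drop: AUC is {recent_auc:.2f}")
```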