Performance Metrics
In addition to profiling the datasets that feed ML models, WhyLabs can automatically track a variety of performance metrics based on model predictions. To make this possible, the following conditions must be met:
- Users must have a list of predictions and actuals, and (for classification models only) confidence scores.
- The relevant model must have its model type declared as classification or regression.
Users can declare a model type when creating a new model from either the Model Dashboard or the Model Management section within Settings.
For any model still assigned the “Unknown” model type, users can declare a new type by editing the model in the Model Management section within Settings.
# Regression
In the case of regression models, the following metrics are tracked:
- Total output and input count
- Mean Squared Error
- Mean Absolute Error
- Root Mean Squared Error
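For reference, with actuals $y_i$, predictions $\hat{y}_i$, and $n$ records, these error metrics follow the standard definitions:

$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i\rvert, \qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
$$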
When logging performance metrics for regression models, the following code can be used.
Similar to other logging methods, users can optionally provide a dataset_timestamp parameter when initializing the logger in cases where backfilling is required.
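Here is a minimal sketch, assuming the whylogs v1 `log_regression_metrics` helper, hypothetical column names, and WhyLabs credentials already set via environment variables; the exact syntax for your whylogs version is shown in the notebook referenced below.

```python
from datetime import datetime, timezone

import pandas as pd
import whylogs as why

# Hypothetical predictions and actuals for a regression model
df = pd.DataFrame({
    "predicted_value": [24.1, 19.8, 31.2],
    "actual_value": [23.5, 20.0, 30.7],
})

# Compute and log regression metrics (MSE, MAE, RMSE) from the two columns
results = why.log_regression_metrics(
    df,
    target_column="actual_value",
    prediction_column="predicted_value",
)

# Optionally set an explicit (timezone-aware) dataset timestamp when backfilling
results.profile().set_dataset_timestamp(datetime(2023, 1, 15, tzinfo=timezone.utc))

# Upload the profile to WhyLabs
results.writer("whylabs").write()
```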
See an example notebook which demonstrates the use of performance metrics for regression models.
# Classification
In the case of classification models, the following metrics are tracked:
- Total output and input count
- Accuracy
- ROC
- Precision-Recall chart
- Confusion Matrix
- Recall
- FPR (false positive rate)
- Precision
- F1
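For reference, given true/false positive and negative counts (TP, FP, TN, FN) from the confusion matrix, these follow the standard definitions:

$$
\mathrm{Precision} = \frac{TP}{TP + FP}, \quad
\mathrm{Recall} = \frac{TP}{TP + FN}, \quad
\mathrm{FPR} = \frac{FP}{FP + TN}, \quad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \quad
\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}
$$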
The metrics above are supported for both binary classification and multi-class classification.
The code for logging classification metrics is similar to that of regression, with the addition of a score associated with each prediction. These scores are often probabilities or some other confidence measure.
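A minimal sketch, assuming the whylogs v1 `log_classification_metrics` helper and hypothetical column names:

```python
import pandas as pd
import whylogs as why

# Hypothetical predictions, actuals, and confidence scores
df = pd.DataFrame({
    "predicted_class": ["cat", "dog", "cat"],
    "actual_class": ["cat", "dog", "dog"],
    "score": [0.92, 0.85, 0.61],  # confidence of each prediction
})

# Compute and log classification metrics; the score column is what
# distinguishes this call from the regression version
results = why.log_classification_metrics(
    df,
    target_column="actual_class",
    prediction_column="predicted_class",
    score_column="score",
)

results.writer("whylabs").write()
```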
See an example notebook which demonstrates the use of performance metrics for classification models.
# Performance Metrics with PySpark
Performance metrics can also be logged with PySpark using the syntax below.
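The exact Spark-native syntax depends on the whylogs version, and the notebook linked below shows the full integration. As an illustrative sketch only, one workable pattern for modest data volumes is to collect the performance columns to the driver and reuse the pandas-based helper (column names here are hypothetical):

```python
import whylogs as why
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("perf-metrics").getOrCreate()

# Hypothetical Spark DataFrame holding predictions, actuals, and scores
predictions_df = spark.createDataFrame(
    [("cat", "cat", 0.92), ("dog", "dog", 0.85), ("cat", "dog", 0.61)],
    ["predicted_class", "actual_class", "score"],
)

# Bring just the three performance columns back to the driver; fine for
# modest volumes, see the linked notebook for the Spark-native approach
pdf = predictions_df.select(
    "predicted_class", "actual_class", "score"
).toPandas()

results = why.log_classification_metrics(
    pdf,
    target_column="actual_class",
    prediction_column="predicted_class",
    score_column="score",
)
results.writer("whylabs").write()
```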
See a full example of logging performance metrics with PySpark in this notebook.
Don't hesitate to reach out if you would like to log performance metrics via one of our other integrations!
# Performance Comparison
WhyLabs allows users to compare the performance of two models side by side. Users can select two models of the same type (classification or regression) in the upper-left dropdowns, and WhyLabs will display plots of performance metrics from each model.
This makes it easy to determine which version performs best when comparing multiple versions of a model.