Metrics
This section describes:
- Monitorable metrics, which can be tracked in the WhyLabs AI Control Center and queried using the WhyLabs API.
- Metrics resulting from monitoring that can be queried using the WhyLabs API.
Monitorable Metrics
Monitorable metrics are either:
- Column metrics specific to a column, such as statistical values and distribution metrics.
- Dataset metrics relating to the overall dataset or a segment of it, including performance metrics and integration health.
These metrics can be monitored by specifying them as the metric in the monitor configuration, and many of them can be queried using the WhyLabs Data API.
Column metrics can be monitored for specific columns in the dataset, and must be used in an analyzer with a targetMatrix of type column, as shown in the Targeting Columns section.
Dataset metrics must be used in an analyzer with a targetMatrix of type dataset, as shown in the Targeting Datasets section.
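The difference between the two targeting modes can be sketched as a small Python helper that builds an analyzer as a plain dictionary. Only the targetMatrix field, its column and dataset types, and the metric setting come from the text above. The helper name build_analyzer, the id, include and config keys, and the example column name age are assumptions made for illustration. Consult the Targeting Columns and Targeting Datasets sections for the actual schema, which also requires settings (such as thresholds) that are omitted here.

```python
import json


def build_analyzer(analyzer_id, metric, analyzer_type, column=None):
    """Return an illustrative analyzer dict.

    If `column` is given, the analyzer targets that column (column metric);
    otherwise it targets the dataset as a whole (dataset metric).
    NOTE: every key except `targetMatrix` and `metric` is an assumption.
    """
    if column is not None:
        # Column metrics need a targetMatrix of type "column".
        target_matrix = {"type": "column", "include": [column]}
    else:
        # Dataset metrics need a targetMatrix of type "dataset".
        target_matrix = {"type": "dataset"}
    return {
        "id": analyzer_id,
        "targetMatrix": target_matrix,
        "config": {"type": analyzer_type, "metric": metric},
    }


# A column metric (count_null_ratio) on a hypothetical column named "age".
column_analyzer = build_analyzer("null-ratio-check", "count_null_ratio", "fixed", column="age")

# A dataset metric (classification.f1) on the dataset as a whole.
dataset_analyzer = build_analyzer("f1-check", "classification.f1", "fixed")

print(json.dumps(column_analyzer, indent=2))
print(json.dumps(dataset_analyzer, indent=2))
```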
Distribution Metrics
The distribution metrics are column metrics that can be used in analyzers of type drift. The frequent_items metric can also be used with an analyzer of type frequent_string_comparison.
Metric name | Description | Data API Support |
---|---|---|
frequent_items | A complex metric representing the counts of the most frequent discrete values in the column | No |
histogram | A complex metric representing the numeric distribution of values in the column using counts of binned numeric values | No |
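To make the two descriptions concrete, the following minimal sketch computes a frequent-items summary and a binned histogram from raw values in plain Python. It only illustrates what the two metrics represent. It is not how WhyLabs or its profiling library computes them, and the function names, the top_n parameter and the bin_edges parameter are all assumptions introduced for this example.

```python
from collections import Counter


def frequent_items(values, top_n=3):
    """Counts of the most frequent discrete values, most frequent first."""
    return Counter(values).most_common(top_n)


def histogram(values, bin_edges):
    """Counts of numeric values per bin.

    Bins are half-open [lo, hi), except the last bin, which also
    includes its upper edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= v < bin_edges[i + 1]
            if in_bin or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    return counts


top_values = frequent_items(["a", "b", "a", "c", "a", "b"], top_n=2)
bin_counts = histogram([1, 2, 2, 5, 9, 10], bin_edges=[0, 5, 10])

print(top_values)  # [('a', 3), ('b', 2)]
print(bin_counts)  # [3, 3]
```

Because both metrics are structured summaries rather than single numbers, a drift analyzer is the natural way to compare them: it contrasts one batch's distribution with a baseline distribution.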
Statistical Value Metrics
These statistical metrics are column metrics that can be used in analyzers of type diff, fixed, seasonal and stddev.
Metric name | Description | Data API Support |
---|---|---|
min | The minimum value in the column | Yes |
max | The maximum value in the column | Yes |
mean | The mean of values in the column | Yes |
median | The median value (i.e. 50th percentile) of the values in the column | Yes |
quantile_5 | The 5th percentile value of the column | Yes |
quantile_25 | The 25th percentile value of the column | Yes |
quantile_75 | The 75th percentile value of the column | Yes |
quantile_95 | The 95th percentile value of the column | Yes |
quantile_99 | The 99th percentile value of the column | Yes |
std_dev | The standard deviation of the values in the column | Yes |
variance | The variance of the values in the column | Yes |
Count Metrics
Count metrics are column metrics that can be used in analyzers of type diff, fixed and stddev.
Metric name | Description | Data API Support |
---|---|---|
count | The count of values in the column | Yes |
count_null | The count of missing/null/NaN values in the column | Yes |
count_null_ratio | The ratio of missing/null/NaN values in the column | Yes |
unique_est | An estimate of the count of unique values in the column | Yes |
unique_est_ratio | The ratio of the unique value count estimate for the column | Yes |
unique_est_lower | The lower bound on the unique value count estimate for the column | Yes |
unique_est_upper | The upper bound on the unique value count estimate for the column | Yes |
count_bool | The count of boolean values in the column | Yes |
count_bool_ratio | The ratio of boolean values in the column | Yes |
count_integral | The count of integer values in the column | Yes |
count_integral_ratio | The ratio of integer values in the column | Yes |
count_fractional | The count of fractional values in the column | Yes |
count_fractional_ratio | The ratio of fractional values in the column | Yes |
count_string | The count of string values in the column | Yes |
count_string_ratio | The ratio of string values in the column | Yes |
Other Column Metrics
The inferred_data_type metric is a column metric that can be used in analyzers of type comparison.
Metric name | Description | Data API Support |
---|---|---|
inferred_data_type | The inferred data type of the values in the column | No |
Classification Metrics
Classification metrics are dataset metrics that can be used in analyzers of type diff, fixed and stddev, provided that classification model metrics have been uploaded for the dataset.
Metric name | Description | Data API Support |
---|---|---|
classification.f1 | F1 score | Yes |
classification.precision | Precision | Yes |
classification.recall | Recall | Yes |
classification.accuracy | Accuracy | Yes |
classification.auroc | Area under the receiver operating characteristic (ROC) curve | Yes |
Regression Metrics
Regression metrics are dataset metrics that can be used in analyzers of type diff, fixed and stddev, provided that regression model metrics have been uploaded for the dataset.
Metric name | Description | Data API Support |
---|---|---|
regression.mse | Mean squared error | Yes |
regression.mae | Mean absolute error | Yes |
regression.rmse | Root mean square error | Yes |
Integration Health Metrics
Integration health metrics are dataset metrics that should only be used in an analyzer of type fixed.
Metric name | Description | Data API Support |
---|---|---|
missingDatapoint | 0 if the last batch contained profile data; 1 if it is missing | No |
secondsSinceLastUpload | Number of seconds elapsed since the last profile upload | No |
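The pairings of metric family and analyzer type described in the subsections above can be collected into a lookup table, which is handy for sanity-checking a monitor configuration before submitting it. The sketch below is a convenience check only and not part of any WhyLabs library. The family labels, the ALLOWED_ANALYZER_TYPES name and the is_compatible function are hypothetical, while the analyzer types themselves are taken from this page.

```python
# Analyzer types permitted for each metric family, as listed on this page.
# The family labels (dictionary keys) are hypothetical names for this sketch.
ALLOWED_ANALYZER_TYPES = {
    "distribution": {"drift"},
    "statistical": {"diff", "fixed", "seasonal", "stddev"},
    "count": {"diff", "fixed", "stddev"},
    "inferred_data_type": {"comparison"},
    "classification": {"diff", "fixed", "stddev"},
    "regression": {"diff", "fixed", "stddev"},
    "integration_health": {"fixed"},
}


def is_compatible(family, analyzer_type, metric=None):
    """Return True if the analyzer type is listed for the metric family."""
    # Special case from the Distribution Metrics section: frequent_items
    # can also be used with a frequent_string_comparison analyzer.
    if metric == "frequent_items" and analyzer_type == "frequent_string_comparison":
        return True
    return analyzer_type in ALLOWED_ANALYZER_TYPES[family]


print(is_compatible("statistical", "seasonal"))          # True
print(is_compatible("count", "seasonal"))                # False
print(is_compatible("integration_health", "stddev"))     # False
print(is_compatible("distribution", "frequent_string_comparison",
                    metric="frequent_items"))            # True
```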
Metrics resulting from monitoring
This section describes the set of monitor metrics that are only available after a monitor has analyzed the profile data. All monitor metrics are currently scoped to a specific column and can be scoped to a specific monitor.
Drift
Drift metrics are only available for monitors of type drift.
Metric name | Description | Data API Support |
---|---|---|
avg_drift | The average of the drift values determined for a specific batch | Yes |
max_drift | The maximum of the drift values determined for a specific batch | Yes |
min_drift | The minimum of the drift values determined for a specific batch | Yes |
Anomaly Count
The anomaly count metric is available for all monitors.
Metric name | Description | Data API Support |
---|---|---|
anomaly_count | A count of the anomalies that have been detected in a specific batch | Yes |
Note: The API requires a column field, so for monitors targeting the dataset as a whole (e.g. performance monitors), the column field should be set to __internal__.datasetMetrics.
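The rule in the note above can be captured in a small helper that fills in the column field when a monitor targets the whole dataset. This is a sketch under assumptions: the payload shape, the metric and monitorId keys, the function name and the example monitor IDs are hypothetical and do not describe the actual WhyLabs API request format. Only the column field and the __internal__.datasetMetrics value come from the note.

```python
# Placeholder column name for dataset-level monitors, as stated in the note.
DATASET_METRICS_COLUMN = "__internal__.datasetMetrics"


def monitor_metric_query(metric, monitor_id, column=None):
    """Build an illustrative query payload for a monitor metric.

    When no column is given (a dataset-level monitor), fall back to the
    placeholder column so that the required column field is always set.
    NOTE: the `metric` and `monitorId` keys are assumptions for this sketch.
    """
    return {
        "metric": metric,
        "monitorId": monitor_id,
        "column": column if column is not None else DATASET_METRICS_COLUMN,
    }


# Column-level monitor on a hypothetical column named "age".
column_query = monitor_metric_query("anomaly_count", "drift-monitor-1", column="age")

# Dataset-level (e.g. performance) monitor: the column field is filled in.
dataset_query = monitor_metric_query("anomaly_count", "f1-monitor-1")

print(column_query["column"])   # age
print(dataset_query["column"])  # __internal__.datasetMetrics
```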
Diff
Diff metrics are only available for monitors of type diff.
Metric name | Description | Data API Support |
---|---|---|
min_diff | The minimum diff value measured for a specific batch | Yes |
max_diff | The maximum diff value measured for a specific batch | Yes |
Threshold
Threshold metrics are available for any monitor which has absolute thresholds (fixed type) or calculated thresholds (e.g. stddev, pct types).
Metric name | Description | Data API Support |
---|---|---|
min_threshold | The minimum threshold value determined for a specific batch | Yes |
max_threshold | The maximum threshold value determined for a specific batch | Yes |