Metrics

This section describes:

Monitorable metrics, which can be tracked in the WhyLabs AI Control Center and queried using the WhyLabs API.
Metrics resulting from monitoring, which can be queried using the WhyLabs API.

Monitorable Metrics

Monitorable metrics are either:

Column metrics, which are specific to a column, such as statistical values and distribution metrics.
Dataset metrics, which relate to the overall dataset or a segment of it, including performance metrics and integration health.

These metrics can be monitored by specifying them as the metric in the monitor configuration, and many of them can be queried using the WhyLabs Data API.

Column metrics can be monitored for specific columns in the dataset, and must be used in an analyzer with a targetMatrix of type column as shown in the Targeting Columns section.

Dataset metrics must be used in an analyzer with a targetMatrix of type dataset as shown in the Targeting Datasets section.
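As a minimal sketch of the two targeting rules above, the following Python builds analyzer configurations as plain dictionaries. Only targetMatrix, its type values column and dataset, and the metric names come from this section. The include and config keys and the helper name make_analyzer are assumptions made for illustration and should be checked against the monitor configuration schema.

```python
import json


def make_analyzer(metric, analyzer_type, target_type, columns=None):
    """Build an illustrative analyzer config as a plain dict (hypothetical helper)."""
    if target_type not in ("column", "dataset"):
        raise ValueError(f"unknown targetMatrix type: {target_type!r}")
    target = {"type": target_type}
    if target_type == "column":
        # Assumed key: the columns the analyzer applies to ("*" meaning all).
        target["include"] = list(columns or ["*"])
    elif columns:
        raise ValueError("dataset targets do not take a column list")
    return {
        "targetMatrix": target,
        # Assumed layout: analyzer type and metric nested under "config".
        "config": {"type": analyzer_type, "metric": metric},
    }


# A column metric paired with a targetMatrix of type column.
column_analyzer = make_analyzer("count_null_ratio", "fixed", "column", ["age"])
# A dataset metric paired with a targetMatrix of type dataset.
dataset_analyzer = make_analyzer("classification.f1", "stddev", "dataset")

print(json.dumps(column_analyzer, indent=2))
print(json.dumps(dataset_analyzer, indent=2))
```

The column name age is a placeholder. A real configuration would also carry fields not shown here, such as thresholds and baselines.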

Distribution Metrics

The distribution metrics are column metrics that can be used in analyzers of type drift. The frequent_items metric can also be used with an analyzer of type frequent_string_comparison.

Metric name	Description	Data API Support
frequent_items	A complex metric representing the counts of the most frequent discrete values in the column	No
histogram	A complex metric representing the numeric distribution of values in the column using counts of binned numeric values	No

Statistical Value Metrics

Statistical value metrics are column metrics that can be used in analyzers of type diff, fixed, seasonal and stddev.

Metric name	Description	Data API Support
min	The minimum value in the column	Yes
max	The maximum value in the column	Yes
mean	The mean of values in the column	Yes
median	The median value (i.e. 50th percentile) of the values in the column	Yes
quantile_5	The 5th percentile value of the column	Yes
quantile_25	The 25th percentile value of the column	Yes
quantile_75	The 75th percentile value of the column	Yes
quantile_95	The 95th percentile value of the column	Yes
quantile_99	The 99th percentile value of the column	Yes
std_dev	The standard deviation of the values in the column	Yes
variance	The variance of the values in the column	Yes

Count Metrics

Count metrics are column metrics that can be used in analyzers of type diff, fixed and stddev.

Metric name	Description	Data API Support
count	The count of values in the column	Yes
count_null	The count of missing/null/NaN values in the column	Yes
count_null_ratio	The ratio of missing/null/NaN values in the column	Yes
unique_est	An estimate of the count of unique values in the column	Yes
unique_est_ratio	The ratio of the unique value count estimate for the column	Yes
unique_est_lower	The lower bound on the unique value count estimate for the column	Yes
unique_est_upper	The upper bound on the unique value count estimate for the column	Yes
count_bool	The count of boolean values in the column	Yes
count_bool_ratio	The ratio of boolean values in the column	Yes
count_integral	The count of integer values in the column	Yes
count_integral_ratio	The ratio of integer values in the column	Yes
count_fractional	The count of fractional values in the column	Yes
count_fractional_ratio	The ratio of fractional values in the column	Yes
count_string	The count of string values in the column	Yes
count_string_ratio	The ratio of string values in the column	Yes

Other Column Metrics

The inferred_data_type metric is a column metric that can be used in analyzers of type comparison.

Metric name	Description	Data API Support
inferred_data_type	The inferred data type of the values in the column	No

Classification Metrics

Classification metrics are dataset metrics that can be used in analyzers of type diff, fixed and stddev, provided that classification model metrics have been uploaded for the dataset.

Metric name	Description	Data API Support
classification.f1	F1 score	Yes
classification.precision	Precision	Yes
classification.recall	Recall	Yes
classification.accuracy	Accuracy	Yes
classification.auroc	Area under the receiver operating characteristic (ROC) curve	Yes

Regression Metrics

Regression metrics are dataset metrics that can be used in analyzers of type diff, fixed and stddev, provided that regression model metrics have been uploaded for the dataset.

Metric name	Description	Data API Support
regression.mse	Mean squared error	Yes
regression.mae	Mean absolute error	Yes
regression.rmse	Root mean square error	Yes

Integration Health Metrics

Integration health metrics are dataset metrics that should only be used in an analyzer of type fixed.

Metric name	Description	Data API Support
missingDatapoint	0 if the last batch contained profile data; 1 if the profile data is missing	No
secondsSinceLastUpload	Number of seconds elapsed since the last profile upload	No
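The pairing rules in the tables above can be collected into a small lookup. The sketch below is one way to check a metric against an analyzer type and targetMatrix type before writing a monitor configuration. The metric names, analyzer types, and target types are taken from this section, while the names METRIC_GROUPS, lookup and is_valid are hypothetical. It treats the "should only be used" guidance for integration health metrics as a hard rule.

```python
DIFF_FIXED_STDDEV = {"diff", "fixed", "stddev"}

# group name -> (targetMatrix type, allowed analyzer types, metric names)
METRIC_GROUPS = {
    "distribution": ("column", {"drift"}, ["frequent_items", "histogram"]),
    "statistical": (
        "column",
        {"diff", "fixed", "seasonal", "stddev"},
        ["min", "max", "mean", "median", "quantile_5", "quantile_25",
         "quantile_75", "quantile_95", "quantile_99", "std_dev", "variance"],
    ),
    "count": (
        "column",
        DIFF_FIXED_STDDEV,
        ["count", "count_null", "count_null_ratio", "unique_est",
         "unique_est_ratio", "unique_est_lower", "unique_est_upper",
         "count_bool", "count_bool_ratio", "count_integral",
         "count_integral_ratio", "count_fractional",
         "count_fractional_ratio", "count_string", "count_string_ratio"],
    ),
    "other": ("column", {"comparison"}, ["inferred_data_type"]),
    "classification": (
        "dataset",
        DIFF_FIXED_STDDEV,
        ["classification.f1", "classification.precision",
         "classification.recall", "classification.accuracy",
         "classification.auroc"],
    ),
    "regression": (
        "dataset",
        DIFF_FIXED_STDDEV,
        ["regression.mse", "regression.mae", "regression.rmse"],
    ),
    "integration_health": (
        "dataset",
        {"fixed"},
        ["missingDatapoint", "secondsSinceLastUpload"],
    ),
}


def lookup(metric):
    """Return (targetMatrix type, allowed analyzer types) for a metric."""
    for target_type, analyzer_types, names in METRIC_GROUPS.values():
        if metric in names:
            allowed = set(analyzer_types)
            if metric == "frequent_items":
                # frequent_items additionally supports this analyzer type.
                allowed.add("frequent_string_comparison")
            return target_type, allowed
    raise KeyError(f"unknown monitorable metric: {metric}")


def is_valid(metric, analyzer_type, target_type):
    """True if the metric may be used with this analyzer and target type."""
    expected_target, allowed = lookup(metric)
    return target_type == expected_target and analyzer_type in allowed


print(is_valid("median", "seasonal", "column"))
print(is_valid("count", "seasonal", "column"))
```

Under these rules median accepts a seasonal analyzer while count does not, because seasonal is listed only for the statistical value metrics.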

Metrics resulting from monitoring

This section describes the set of monitor metrics that are only available after a monitor has analyzed the profile data. All monitor metrics are currently scoped to a specific column and can be scoped to a specific monitor.

Drift

Drift metrics are only available for monitors of type drift.

Metric name	Description	Data API Support
avg_drift	The average of the drift values determined for a specific batch	Yes
max_drift	The maximum of the drift values determined for a specific batch	Yes
min_drift	The minimum of the drift values determined for a specific batch	Yes

Anomaly Count

The anomaly count metric is available for all monitors.

Metric name	Description	Data API Support
anomaly_count	A count of the anomalies that have been detected in a specific batch	Yes

Note: The API requires a column field, so for monitors targeting the dataset as a whole (e.g. performance monitors), the column field should be set to __internal__.datasetMetrics.
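The note above can be sketched as a small helper that always fills in the column field. Only the value __internal__.datasetMetrics, the metric name, and the requirement that a column be supplied come from this section. The payload keys metric, column and monitorId, and the function name monitor_metric_query, are assumptions for illustration and do not reflect the actual request schema of the WhyLabs API.

```python
DATASET_COLUMN = "__internal__.datasetMetrics"


def monitor_metric_query(metric, column=None, monitor_id=None):
    """Build a hypothetical query payload for a monitor metric."""
    query = {
        "metric": metric,
        # Dataset-level monitors have no real column, so fall back to the
        # internal placeholder column named in the note above.
        "column": column if column else DATASET_COLUMN,
    }
    if monitor_id is not None:
        # Assumed key for scoping the query to a specific monitor.
        query["monitorId"] = monitor_id
    return query


# Anomalies for a column-level monitor on a placeholder column "age".
print(monitor_metric_query("anomaly_count", column="age"))
# Anomalies for a dataset-level monitor, such as a performance monitor.
print(monitor_metric_query("anomaly_count", monitor_id="f1-monitor"))
```

The identifiers age and f1-monitor are placeholders.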

Diff

Diff metrics are only available for monitors of type diff.

Metric name	Description	Data API Support
min_diff	The minimum diff value measured for a specific batch	Yes
max_diff	The maximum diff value measured for a specific batch	Yes

Threshold

Threshold metrics are available for any monitor that has absolute thresholds (fixed type) or calculated thresholds (e.g. stddev and pct types).

Metric name	Description	Data API Support
min_threshold	The minimum threshold value determined for a specific batch	Yes
max_threshold	The maximum threshold value determined for a specific batch	Yes
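The availability rules for monitor metrics can be summarized in a short function. This is a sketch under one stated assumption: the text names stddev and pct only as examples of calculated-threshold types, so the set THRESHOLD_TYPES below may be incomplete. The function name available_monitor_metrics is hypothetical.

```python
# Monitor types treated as having thresholds. "fixed" is absolute; "stddev"
# and "pct" are the calculated examples named above (assumed non-exhaustive).
THRESHOLD_TYPES = {"fixed", "stddev", "pct"}


def available_monitor_metrics(monitor_type):
    """List the monitor metrics expected for a given monitor type."""
    metrics = ["anomaly_count"]  # available for all monitors
    if monitor_type == "drift":
        metrics += ["avg_drift", "max_drift", "min_drift"]
    if monitor_type == "diff":
        metrics += ["min_diff", "max_diff"]
    if monitor_type in THRESHOLD_TYPES:
        metrics += ["min_threshold", "max_threshold"]
    return metrics


for monitor_type in ("drift", "diff", "fixed"):
    print(monitor_type, available_monitor_metrics(monitor_type))
```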
