
Model Explainability [BETA]

Maintaining explainability throughout a model’s life cycle is becoming increasingly important for running responsible ML applications. The WhyLabs AI Observatory makes this possible by helping you understand why and how a model produces its output, both for the full set of input data and for subsets of it, over time.

*This feature is in private beta. Users can reach out to us to have it enabled for their team.

Enabling Explainability#

Feature importance information can be supplied alongside profile information. Users can choose the explainability technique most suitable for their use case; we recommend using absolute Shapley values for feature importance.
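
As an illustration, per-row Shapley values can be computed with the shap library and aggregated into absolute values per feature. This is a minimal sketch using a scikit-learn regressor on a public dataset; the dataset, model, and explainer are stand-ins for your own.

```python
import numpy as np
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

# Illustrative dataset and model; replace with your own.
X, y = fetch_california_housing(as_frame=True, return_X_y=True)
X, y = X.iloc[:1000], y.iloc[:1000]
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Per-row Shapley values: shape (n_rows, n_features)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Absolute Shapley values, averaged per feature
weights = dict(zip(X.columns, np.abs(shap_values).mean(axis=0)))
print(weights)
# This {feature: importance} mapping can then be supplied alongside your
# whylogs profiles; see the whylogs documentation for the exact writer API.
```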

Global Feature Importance#

Global feature importance allows the user to understand which features have the strongest influence on a specific model’s output across an entire batch of input data. The default view of the Explainability tab shows the feature importance data for the most recent profile available.

Tracking global feature importance helps users prioritize features for monitoring and better quantify the severity of any anomalies that occur.
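
Reusing the `weights` mapping from the sketch above, a simple ranking is enough to decide which features deserve the tightest monitors; the cut-off of three below is purely illustrative.

```python
# Rank features by global importance (reusing `weights` from the earlier sketch).
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
critical = [name for name, _ in ranked[:3]]  # illustrative cut-off
print("Highest-priority features to monitor:", critical)
```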

[Image: global feature importance view]

Segment (Cohort) Feature Importance#

ML practitioners are often interested in understanding feature importances for individual subsets of data and how these differ at the segment level. whylogs allows users to define segments when profiling a dataset, which enables this level of observability within the WhyLabs platform.

As an example, tracking feature importance at the segment level can reveal important differences in the influence that various input features have on a model’s output for users from different geographic regions. It can also surface unwanted biases in a model and help ML practitioners determine whether intervention is required.
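
To illustrate the geographic example, the absolute Shapley values from the earlier sketch can be grouped by a segment column. The Latitude-based region below is hypothetical; in practice you would reuse the same segments you define when profiling with whylogs.

```python
import numpy as np
import pandas as pd

# Hypothetical segment column derived from the data; reuse your real segments.
region = np.where(X["Latitude"] > 36, "north", "south")

abs_shap = pd.DataFrame(np.abs(shap_values), columns=X.columns, index=X.index)
per_segment_importance = abs_shap.groupby(region).mean()
print(per_segment_importance)  # one row of feature importances per segment
```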

Individual (Local) Explanations#

Approaches like LIME, SHAP, and integrated gradients provide interpretable information for each individual data point queried. In this case, users can profile the results from their favorite local explanation method alongside their data using whylogs.

Tracking feature importance at the individual level can reveal important drift and distributional issues that global- and segment-level feature importance can miss. For example, a sudden change in the distribution of local explanations may reveal that incoming data sits differently relative to previously seen data or to the model’s decision boundary.
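
A minimal sketch of that workflow, reusing X and shap_values from the earlier sketch: the local explanations are added as extra columns (the `shap_` prefix is an arbitrary naming choice) so they are profiled alongside the inputs. It assumes whylogs v1’s pandas logging.

```python
import pandas as pd
import whylogs as why

# One column of local explanations per input feature (hypothetical naming).
local_expl = pd.DataFrame(
    shap_values, columns=[f"shap_{c}" for c in X.columns], index=X.index
)
combined = pd.concat([X, local_expl], axis=1)

# A single profile now captures both the inputs and their local explanations,
# so a shift in either distribution can be caught by the same monitors.
results = why.log(combined)
```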

Comparing Feature Importance#

Within the Explainability tab, users can directly compare feature importances for multiple profiles.

[Image: comparing feature importance across profiles]

In the event of data drift, this can help users gauge the impact that drift has had on their model health. For example, the distribution of a particular feature may drift outside of the range that a model was originally trained on. This can lead to extreme predictions from the model and an undesired spike in that feature’s importance.

Comparing feature importance for multiple profiles can also help users examine how relative feature importance has changed after a model is retrained. These changes may offer valuable insights that influence how features are prioritized for monitoring, how severity levels are assigned, and how users segment their profiles when uploading.
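
As a rough offline analogue of that comparison, the mean absolute Shapley values of two batches can be contrasted directly; here the two halves of X from the earlier sketch stand in for two profiles from different time windows or model versions.

```python
import numpy as np
import pandas as pd

abs_shap = pd.DataFrame(np.abs(shap_values), columns=X.columns, index=X.index)
half = len(abs_shap) // 2
batch_a = abs_shap.iloc[:half].mean()  # e.g. reference profile / old model
batch_b = abs_shap.iloc[half:].mean()  # e.g. current profile / retrained model

comparison = pd.DataFrame({
    "batch_a": batch_a,
    "batch_b": batch_b,
    "delta": batch_b - batch_a,
}).sort_values("delta", key=np.abs, ascending=False)
print(comparison)  # large deltas flag features whose influence has shifted
```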

Monitoring Powered by Explainability#

WhyLabs automatically classifies the most important features as “critical features”, allowing users to easily target these features during monitor configuration.

[Image: monitoring by feature importance]

WhyLabs also displays these critical features as part of a model’s summary, allowing users to maintain visibility into the features which have the most influence on the model’s behavior.

[Image: explainability summary]

Debugging Powered by Explainability#

As detailed above, maintaining explainability enables powerful model debugging capabilities. These include:

  • Identifying bias in a model
  • Understanding the impact of drift on model behavior
  • Uncovering differences in model behavior for various data segments
  • Informing decisions about model architecture (segments with significantly different feature importances may require different models)
  • Identifying the most important features to monitor and determining the severity of notifications