WhyLogsRun Objects

class WhyLogsRun(object)


| log_pandas(df: pd.DataFrame, dataset_name: Optional[str] = None)

Log the statistics of a Pandas dataframe. Note that this method is additive within a run: calling it again with the same dataset name does not generate a new profile; instead, the data is aggregated into the existing profile for that name.

To create a new profile, specify a different dataset_name.


  • df: the Pandas dataframe to log
  • dataset_name: the name of the dataset (Optional). If not specified, the experiment name is used
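
The additive behavior described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyRun` and `log_rows` are hypothetical names, a "profile" is reduced to per-column value counts, and plain dictionaries stand in for dataframe rows. It is not the whylogs implementation.

```python
from typing import Dict, List, Optional


class ToyRun:
    """Hypothetical stand-in for a run; NOT the whylogs implementation."""

    def __init__(self, experiment_name: str) -> None:
        self.experiment_name = experiment_name
        # One "profile" per dataset name; here a profile is just per-column counts.
        self.profiles: Dict[str, Dict[str, int]] = {}

    def log_rows(self, rows: List[dict], dataset_name: Optional[str] = None) -> None:
        # Fall back to the experiment name when no dataset name is given.
        name = dataset_name or self.experiment_name
        # Reuse the existing profile for this name, or create it on first use.
        profile = self.profiles.setdefault(name, {})
        for row in rows:
            for column in row:
                profile[column] = profile.get(column, 0) + 1


run = ToyRun("my-experiment")
run.log_rows([{"a": 1}, {"a": 2}])           # goes to the default profile
run.log_rows([{"a": 3}])                     # same profile, counts accumulate
run.log_rows([{"a": 4}], "another dataset")  # a separate profile

print(run.profiles)
# {'my-experiment': {'a': 3}, 'another dataset': {'a': 1}}
```

Keying profiles by name is what makes repeated calls additive: only a name that has not been seen before produces a new profile.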


| log(features: Dict[str, any] = None, feature_name: str = None, value: any = None, dataset_name: Optional[str] = None)

Logs a collection of features or a single feature (must specify one or the other).


  • features: a dictionary of key->value pairs for model input, used to log multiple features at once. Each entry represents a single columnar feature
  • feature_name: name of a single feature. Cannot be specified if 'features' is specified
  • value: value of a single feature. Cannot be specified if 'features' is specified
  • dataset_name: the name of the dataset. If not specified, we fall back to using the experiment name
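
The "one or the other" rule for these arguments could be enforced along the following lines. This is a minimal sketch: `normalize_log_args` is a hypothetical helper, and raising `ValueError` is an assumption, since the text above does not say how whylogs reports a conflicting call.

```python
from typing import Any, Dict, Optional


def normalize_log_args(
    features: Optional[Dict[str, Any]] = None,
    feature_name: Optional[str] = None,
    value: Any = None,
) -> Dict[str, Any]:
    """Hypothetical helper: turn either calling style into a features dict."""
    if features is not None and (feature_name is not None or value is not None):
        # Assumption: a conflicting call is rejected with ValueError.
        raise ValueError("Specify either 'features' or 'feature_name'/'value', not both")
    if features is not None:
        return dict(features)
    if feature_name is None:
        raise ValueError("Specify 'features' or 'feature_name'")
    # A single feature is treated as a one-entry features dict.
    return {feature_name: value}


print(normalize_log_args(features={"age": 42, "country": "NZ"}))
print(normalize_log_args(feature_name="age", value=42))
```

Reducing both calling styles to a single dictionary means the rest of the logging path only has to handle one shape of input.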



Hijack the mlflow.models.Model.log method and upload the .whylogs.yaml configuration to the model path. This allows the configuration to be picked up later under the /opt/ml/model/.whylogs.yaml path.
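
The general technique here is monkey-patching: keep a reference to the original method, then replace it with a wrapper that does the extra work. The sketch below illustrates the idea on a stand-in class. `Model`, `uploaded`, and `_patched_log` are hypothetical names, and performing the upload after the original call is an assumption; it is not the actual whylogs patch.

```python
from typing import List


class Model:
    """Stand-in with a classmethod log(); NOT mlflow.models.Model."""

    @classmethod
    def log(cls, artifact_path: str) -> str:
        return f"logged model to {artifact_path}"


# Records the "uploads" so the effect of the patch is visible.
uploaded: List[str] = []

# Keep a reference to the original (bound) classmethod before replacing it.
_original_log = Model.log


def _patched_log(artifact_path: str) -> str:
    # Delegate to the original behavior first (the ordering is an assumption).
    result = _original_log(artifact_path)
    # Then place the configuration file next to the model artifacts.
    uploaded.append(f"{artifact_path}/.whylogs.yaml")
    return result


# Install the wrapper; staticmethod keeps the call signature unchanged.
Model.log = staticmethod(_patched_log)

print(Model.log("model"))  # logged model to model
print(uploaded)            # ['model/.whylogs.yaml']
```

Because callers keep invoking `Model.log` as before, the extra upload happens without any change to their code.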


enable_mlflow() -> bool

Enable whylogs in the mlflow module via mlflow.whylogs.


Returns True if MLFlow has been patched, False otherwise.

.. code-block:: python
    :caption: Example of whylogs and MLFlow

    import mlflow
    import whylogs
    import pandas as pd

    # Patch MLFlow so that mlflow.whylogs becomes available
    whylogs.enable_mlflow()

    pdf = pd.DataFrame(
        data=[[1, 2, 3, 4, True, "x", bytes([1])]],
        columns=["b", "d", "a", "c", "e", "g", "f"],
    )

    active_run = mlflow.start_run()

    # log a Pandas dataframe under default name
    mlflow.whylogs.log_pandas(pdf)

    # log a Pandas dataframe with custom name
    mlflow.whylogs.log_pandas(pdf, "another dataset")

    # Finish the MLFlow run
    mlflow.end_run()