What is WhyLabs
WhyLabs is an AI observability platform that allows you to monitor your data pipelines and Machine Learning models in production. If you deploy an ML model but don’t have visibility into its performance, you risk harming your business through model degradation caused by data or concept drift, data corruption, schema changes, and more.
WhyLabs monitors your datasets and ML models using data profiles: statistical snapshots of your data that preserve privacy and allow for monitoring at scale. More information about data profiles and how you can use them can be found here.
Data profiling with WhyLabs is a better strategy to monitor ML models and datasets because of:
- Privacy: WhyLabs never asks users for raw data, because it needs only the statistical values that can indicate potential anomalies.
- Cost effectiveness: WhyLabs' data profiles are lightweight and store only the information that is actually needed to monitor your data.
- Scalability: Data profiles are mergeable in any order, which enables MapReduce operations on larger volumes of data, leveraging Big Data technologies such as Ray, Apache Spark, Beam, etc.
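To make the mergeability point concrete, here is a minimal sketch in plain Python. The `ColumnProfile` class and its fields are hypothetical and are not the whylogs profile format; they only illustrate how a summary that stores statistics rather than raw rows can be built per partition and merged in any order.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnProfile:
    """Hypothetical summary of one numeric column (not the whylogs format)."""
    count: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    @classmethod
    def from_values(cls, values):
        """Summarize raw values; only the statistics are kept afterwards."""
        values = list(values)
        if not values:
            return cls()
        return cls(len(values), float(sum(values)),
                   float(min(values)), float(max(values)))

    def merge(self, other):
        """Combine two summaries without access to the underlying rows."""
        return ColumnProfile(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self):
        return self.total / self.count if self.count else math.nan


# Each partition is profiled independently, for example on a separate worker.
# The values are small so that the floating-point sums are exact.
partitions = [[3.0, 5.0, 4.0], [10.0, 2.0], [7.0]]
profiles = [ColumnProfile.from_values(p) for p in partitions]

# Merge the partial profiles in two different orders.
forward = ColumnProfile()
for profile in profiles:
    forward = forward.merge(profile)

backward = ColumnProfile()
for profile in reversed(profiles):
    backward = backward.merge(profile)

# Profile of the full dataset computed in one pass, for comparison.
single_pass = ColumnProfile.from_values(v for p in partitions for v in p)

print(forward == backward == single_pass)
```

Because `merge` is associative and commutative for these statistics, the map step (profiling a partition) and the reduce step (merging) can run on whatever engine distributes the data. Real profiles track richer statistics than this sketch does.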
WhyLabs relies on profiles generated by the open source whylogs library to monitor the data flowing through your pipeline or being fed to your model. For language models, WhyLabs relies on LangKit to extract LLM- and NLP-specific telemetry. These profiles allow you to monitor the performance of your ML models, as they can capture prediction metrics.
Profiles created via whylogs or LangKit contain a variety of statistics describing your dataset and vary depending on whether you’re profiling tabular data, text data, image data, etc. These profiles are generated locally, so your actual data never leaves your environment. Profiles are uploaded via API to the WhyLabs AI Observability Platform, which provides extensive monitoring and alerting capabilities out of the box.
The WhyLabs approach to AI observability and monitoring is based on cutting-edge research. Flexibility is a priority, and the platform provides many customizable options to enable use-case-specific implementations.
With WhyLabs, you can guard against model performance degradation by monitoring your model or dataset with a platform that’s easy to use, privacy-preserving, and cost-efficient. To read more about WhyLabs, check out the WhyLabs Overview.
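As a rough illustration of how monitoring can work on profiles alone, the sketch below flags a shift in a feature's mean by comparing a reference summary against a newer one. The `NumericSummary` fields, the `mean_shift` rule, and the threshold of 3.0 are all assumptions made for this example; they are not the monitors that the WhyLabs platform actually runs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericSummary:
    """Hypothetical profile fields; real profiles hold richer statistics."""
    count: int
    mean: float
    stddev: float


def mean_shift(reference, current):
    """Return how many reference standard deviations the mean has moved."""
    if reference.stddev == 0:
        return 0.0 if current.mean == reference.mean else math.inf
    return abs(current.mean - reference.mean) / reference.stddev


def has_drifted(reference, current, threshold=3.0):
    """Flag drift when the shift exceeds an (arbitrary) threshold."""
    return mean_shift(reference, current) > threshold


# Summary captured at training time, then two later production batches.
reference = NumericSummary(count=10_000, mean=50.0, stddev=5.0)
stable = NumericSummary(count=1_200, mean=51.0, stddev=5.2)
shifted = NumericSummary(count=1_150, mean=72.5, stddev=5.1)

for name, batch in [("stable", stable), ("shifted", shifted)]:
    print(name, mean_shift(reference, batch), has_drifted(reference, batch))
```

Note that the check never touches raw rows: both inputs are statistical summaries, which is what makes this style of monitoring compatible with keeping data in your own environment.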
What is whylogs
whylogs is the open source standard for profiling data. whylogs automatically creates statistical summaries of datasets, called profiles, which imitate the logs produced by other software applications. The library was developed with the goal of bridging the data logging gap by providing profiling capabilities to capture data-specific logs. whylogs profiles are descriptive, lightweight, and mergeable, making them a natural fit for data logging applications. whylogs can generate logs from datasets stored in Python, Java, or Spark environments.
To read more about whylogs, check out the whylogs Overview.
What is LangKit
LangKit is built on whylogs and is designed for language models. LangKit provides out-of-the-box telemetry from the prompts and responses of LLMs to help you track critical metrics about quality, relevance, sentiment, and security. LangKit is designed to be modular and extensible, allowing users to add their own telemetry and metrics.
To read more about what you can do with LangKit, check out the LLM overview.
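The idea of modular, user-extensible telemetry over prompts and responses can be sketched in plain Python. Everything below (the `METRICS` registry, the `metric` decorator, the metric names, and the rough email pattern) is hypothetical and is not LangKit's API; it only shows the general shape of registering a custom metric and applying it to both sides of an LLM exchange.

```python
import re

# Hypothetical registry of metric functions; this is not LangKit's API.
METRICS = {}


def metric(name):
    """Register a function that maps a piece of text to a number."""
    def register(func):
        METRICS[name] = func
        return func
    return register


@metric("char_count")
def char_count(text):
    return len(text)


@metric("word_count")
def word_count(text):
    return len(text.split())


@metric("has_email")
def has_email(text):
    # Very rough pattern, for illustration only.
    return int(bool(re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)))


def extract_telemetry(prompt, response):
    """Compute every registered metric for both the prompt and the response."""
    telemetry = {}
    for name, func in METRICS.items():
        telemetry[f"prompt.{name}"] = func(prompt)
        telemetry[f"response.{name}"] = func(response)
    return telemetry


row = extract_telemetry(
    prompt="What is my account email?",
    response="Your email is jane@example.com",
)
print(row)
```

Adding a new metric only requires decorating another function, and the resulting numeric columns are the kind of values that can then be summarized into a profile instead of storing the text itself.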
How to Navigate These Docs
Our documentation contains conceptual explanations, technical specifications, and tutorials.
In this Overview section, you'll find conceptual explanations that give you context about the WhyLabs Platform and the open source whylogs library.
In the Use Cases section, you'll find tutorials that walk you through common tasks with whylogs and WhyLabs, such as Generating Profiles or Checking Data Quality.
In the Integrations section, you'll find more tutorials that show how to integrate with various other DataOps and MLOps tools, such as MLflow and Databricks.
In the WhyLabs Platform section, you'll primarily find technical specifications of the WhyLabs platform, as well as some conceptual explanations of its features.
In the whylogs API section, you'll primarily find technical specifications of the whylogs library, as well as some conceptual explanations of its features.