LLM Integrations
We have created a dedicated LLM version of the whylogs container to integrate your existing LLM applications with WhyLabs. The whylogs container is a low-code solution that relies on API calls to asynchronously profile and monitor your data with WhyLabs. To learn more about how to use the whylogs container, refer to the docs.
Usage
The /v1/chat/completions endpoint on the container serves as a proxy for making requests to OpenAI's Chat Completions API.
It takes a request prompt and forwards it to the OpenAI API. Additionally, it transforms the user prompt message and the
generated response from OpenAI into whylogs profiles using langkit and uploads these profiles to WhyLabs automatically.
By using this proxy, your LLM application is automatically monitored by WhyLabs while maintaining a similar input and output
experience for the API client.
The idea behind the OpenAI proxy endpoint is to change as little as possible about OpenAI's Chat Completions request structure.
It only requires an extra request header, whylabs_dataset_id, so it can tie a specific prompt to an existing WhyLabs dataset-id.
import requests

resp = requests.post(
    url="http://localhost:8000/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        # Replace $OPENAI_API_KEY with your OpenAI API key
        "Authorization": "Bearer $OPENAI_API_KEY",
        # Ties this prompt/response pair to a WhyLabs dataset-id
        "whylabs_dataset_id": "model-X",
    },
    json={
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "user", "content": "Say this is a test!"},
        ],
    },
)
This will return:
{
  "id": "chatcmpl-{ID}",
  "object": "chat.completion",
  "created": {timestamp},
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This is a test!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 5,
    "total_tokens": 18
  }
}
The prompt and response also get asynchronously profiled and pushed to WhyLabs on a pre-defined cadence of 5 minutes.
The LLM dataset is pre-configured to be daily, so all prompts and responses will get merged into the daily profile.
For failed validations, the response will be a JSON object containing the validation result's metadata, as shown below.
{
  "metadata": {
    "prompt_id": "2477935a-9ffa-49ba-a078-de3bfb4dbf4c",
    "validator_name": "sentiment_prompt_-0.2_validator",
    "failed_metric": "sentiment_prompt_-0.2",
    "value": 0.0,
    "timestamp": 1697062002275,
    "is_valid": false
  }
}
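Because the proxy can return either a regular chat completion or a validation-failure payload, the API client can branch on the response body. The snippet below is a minimal sketch of such a check based on the two response shapes shown above; the helper name and error handling are illustrative and not part of the container's API.

import requests

def ask_llm_proxy(prompt: str, dataset_id: str, openai_api_key: str) -> str:
    """Send a prompt through the whylogs container proxy and handle both
    a normal completion and a failed-validation response (illustrative sketch)."""
    resp = requests.post(
        url="http://localhost:8000/v1/chat/completions",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {openai_api_key}",
            "whylabs_dataset_id": dataset_id,
        },
        json={
            "model": "gpt-3.5-turbo",
            "messages": [{"role": "user", "content": prompt}],
        },
    )
    body = resp.json()

    # A failed validation returns a "metadata" object instead of "choices"
    if "metadata" in body and not body["metadata"].get("is_valid", True):
        raise ValueError(f"Prompt rejected by validator: {body['metadata']['failed_metric']}")

    # Otherwise this is a regular OpenAI-style chat completion
    return body["choices"][0]["message"]["content"]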
Configuration
To set up the LLM container, you will need to:
- Define environment variables
- Create your LLM configuration files
- Customize the deployment with Docker
- Build the image & deploy the container
Each of these steps is described in detail below.
1. Define environment variables
Set the following environment variables in a local.env file:
# Your WhyLabs org id
WHYLABS_ORG_ID=org-0
# An api key from the org above
WHYLABS_API_KEY=xxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# One of these two must be set
# Set the container password to `password`. See the auth section for details
CONTAINER_PASSWORD=password
# If you don't care about password protecting the container then you can set this to True.
DISABLE_CONTAINER_PASSWORD=True
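If you want to sanity-check local.env before deploying, a small script like the one below can confirm that the required variables are present. This is only a convenience sketch, not part of the container.

from pathlib import Path

# Parse the simple KEY=VALUE lines from local.env (comments and blank lines ignored)
env = {}
for line in Path("local.env").read_text().splitlines():
    line = line.strip()
    if line and not line.startswith("#"):
        key, _, value = line.partition("=")
        env[key] = value

missing = [k for k in ("WHYLABS_ORG_ID", "WHYLABS_API_KEY") if not env.get(k)]
if missing:
    raise SystemExit(f"Missing required variables: {missing}")
if not (env.get("CONTAINER_PASSWORD") or env.get("DISABLE_CONTAINER_PASSWORD")):
    raise SystemExit("Set either CONTAINER_PASSWORD or DISABLE_CONTAINER_PASSWORD")
print("local.env looks good")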
2. Create your LLM configuration files
There are two modes in which the proxy can work with your ChatGPT application and the LLM whylogs container.
2.1 Proxy configuration
To enable this mode, you will need to set the whylabs_dataset_id key in the configuration file to the dataset-id that you want to use.
If you don't have a dataset-id configured yet, you can create one on the WhyLabs UI.
The file should be placed under whylogs_config/{name_of_the_file}.yaml. It is possible to define multiple configuration files, one per whylabs_dataset_id. To correctly configure the validations, the file must meet the following schema:
whylabs_dataset_id: model-15
# The following keys are placeholders for future WhyLabs-managed configurations, but required to correctly set the container up
policy: my_policy
id: 9294f3fa-4f4b-4363-9397-87d3499fce28 # a random uuid
policy_version: 1
schema_version: 0.0.1
2.2 Validation configuration
The second mode is a validation mode, where the container will also validate the prompt and/or response messages against a set of user-defined rules and thresholds. To enable the validation mode, you will need to create a YAML file containing the configuration for the langkit modules and thresholds, as well as the same header keys shown above, placed in the same path as described above.
whylabs_dataset_id: model-15
policy: my_new_policy
id: 9294f3fa-4f4b-4363-9397-87d3499fce28
policy_version: 1
schema_version: 0.0.1
rules:
  prompt:
    - module: themes
      upper_threshold: 0.2
    - module: sentiment
      lower_threshold: 0.0
    - module: textstat
      metric: character_count
      upper_threshold: 200
      lower_threshold: 2
    - module: toxicity
      upper_threshold: 0.7
    - module: regexes
      config_path: "whylogs_config/path/to/regexes.json"
  response:
    - module: toxicity
      upper_threshold: 0.9
    - module: input_output
      upper_threshold: 0.5
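With the configuration above, a prompt that violates one of the prompt rules, for example one that exceeds the character_count upper threshold of 200, should come back as a failed-validation payload rather than a completion. The snippet below is an illustrative sketch that reuses the request and response shapes shown earlier; the exact validator naming in the response is defined by the container.

import requests

# A prompt longer than the configured character_count upper threshold (200)
too_long_prompt = "tell me a story " * 20  # roughly 320 characters

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        # Replace $OPENAI_API_KEY with your OpenAI API key
        "Authorization": "Bearer $OPENAI_API_KEY",
        "whylabs_dataset_id": "model-15",
    },
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": too_long_prompt}],
    },
)

body = resp.json()
# Expect a failed-validation payload with is_valid: false for this prompt
print(body.get("metadata", body))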
3. Customize the deployment with Docker
Now create a Dockerfile that extends the LLM container image and copies the configuration files to the correct path, as shown below:
FROM public.ecr.aws/whylabs-dev/whylabs-container:latest
COPY ./whylogs_config/ /opt/whylogs-container/whylogs_container/whylabs/whylogs_config/
4. Build the image & deploy the container
We have created an example repository that you can use to deploy the container on your local machine with Docker Compose.
docker compose up --build
To learn more about other configuration options for the container, refer to this section.
Upcoming features
At this point, this project is at a beta stage, so any feedback is much appreciated. Some of the features that we are currently working on to make it even more useful are:
- Configuring different preset actions for different thresholds
- Persisting failed prompts' metadata to WhyLabs as Debug Events
- Managing configuration directly on WhyLabs
Troubleshooting
If you need help setting up the container, reach out to us on Slack or via email. See the GitHub repo for submitting issues and feature requests.