Initialization and Authentication with whylogs

The recommended way to initialize whylogs when you're sending profiles to WhyLabs is by calling why.init() before any of your profiling code.

import whylogs as why
import pandas as pd

why.init() # Automatically determines how to authenticate
df = pd.read_csv('data.csv') # get some data
profile = why.log(df) # profile the data and automatically upload to WhyLabs

The intent of why.init is that you can always call it at the start of your program and not worry too much about the details of authentication and initialization.

How why.init works

When you call why.init, it attempts to determine what should happen with your profiles by creating a session of a particular type. A session isn't something you have to care about; it roughly corresponds to the lifespan of the current program or the current notebook kernel. A session has one of three types:

  • WhyLabs Authenticated (WHYLABS) - Assumes you will be eventually uploading the profiles you generate to WhyLabs.
  • WhyLabs Anonymous (WHYLABS_ANONYMOUS) - Also assumes you will eventually be uploading to WhyLabs, but doesn't require an api key or organization id. It creates a new anonymous session for you that can be viewed by anyone who has the generated links. You can share the links with whoever you like, or no one.
  • Local (LOCAL) - Doesn't do anything with your profiles automatically and doesn't require any credentials or configuration.

The session type is determined by looking at the current environment config, the contents of the whylogs.ini config file, and (optional) hard-coded config in your code. The order is roughly as follows:

  1. If there is an api key directly supplied to init via why.init(whylabs_api_key='...'), then use it and authenticate the session as WHYLABS.
  2. If there is an api key in the environment variable WHYLABS_API_KEY, then use it and authenticate the session as WHYLABS.
  3. If there is an api key in the whylogs config file, then use it and authenticate the session as WHYLABS.
  4. If there is an anonymous session id in the whylogs config file then use it and authenticate the session as WHYLABS_ANONYMOUS.
  5. If we're in an interactive environment (notebook, colab, etc.) then prompt to pick a method explicitly.
  6. If we're not in an interactive environment and allow_anonymous=True, then authenticate session as WHYLABS_ANONYMOUS.
  7. If we're not in an interactive environment and allow_local=True, then authenticate session as LOCAL.
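The ordering above can be sketched as a small pure-Python function. This is only an illustrative model of the seven steps, not the whylogs implementation: the function name resolve_session_type, its parameters, the "PROMPT" return value, and the None result when nothing matches are all assumptions made for this sketch.

```python
from typing import Mapping, Optional


def resolve_session_type(
    *,
    explicit_api_key: Optional[str] = None,
    environ: Mapping[str, str],
    config_api_key: Optional[str] = None,
    config_anonymous_session_id: Optional[str] = None,
    interactive: bool,
    allow_anonymous: bool,
    allow_local: bool,
) -> Optional[str]:
    """Hypothetical model of the fallback order described above."""
    # Steps 1-3: an api key wins, checked from most to least explicit.
    if explicit_api_key:
        return "WHYLABS"
    if environ.get("WHYLABS_API_KEY"):
        return "WHYLABS"
    if config_api_key:
        return "WHYLABS"
    # Step 4: reuse an anonymous session stored in the config file.
    if config_anonymous_session_id:
        return "WHYLABS_ANONYMOUS"
    # Step 5: interactive environments ask the user to choose.
    # "PROMPT" is a placeholder invented for this sketch.
    if interactive:
        return "PROMPT"
    # Steps 6-7: non-interactive fallbacks, if they are allowed.
    if allow_anonymous:
        return "WHYLABS_ANONYMOUS"
    if allow_local:
        return "LOCAL"
    # Assumption: the list above does not say what happens here.
    return None
```

For example, under this model a non-interactive job with no api key anywhere, allow_anonymous=False, and allow_local=True resolves to LOCAL, while the same job with an api key in WHYLABS_API_KEY resolves to WHYLABS regardless of the other flags.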

First time use

If this is your first time using whylogs and WhyLabs then you'll probably want to let the interactive prompt guide you. You can do this by either using why.init in a notebook or by running python -m whylogs.api.whylabs.session.why_init from the command line in an environment that you've installed whylogs into.

$ python -m whylogs.api.whylabs.session.why_init

Initialing session with config /home/user/.config/whylogs/config.ini
❓ What kind of session do you want to use?
⤷ 1. WhyLabs. Use an api key to upload to WhyLabs.
⤷ 2. WhyLabs Anonymous. Upload data anonymously to WhyLabs and get a viewing url.

Enter a number from the list: 1

Enter your WhyLabs api key. You can find it at https://hub.whylabsapp.com/settings/access-tokens: xxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:org-xxxxxx

[OPTIONAL] Enter a default dataset id to upload to: model-54

✅ Using session type: WHYLABS
⤷ org id: org-JpsdM6
⤷ api key: 1y6ltXaa6a
⤷ default dataset: model-54

The interactive prompt is the same whether it's run in a notebook or via the cli. The only difference is that running it from the cli wipes out the current state of the whylogs.ini config file and starts fresh. Once you exit the prompt successfully, you'll have a new whylogs.ini config file containing the information you entered, and it will be used to determine your authentication method the next time why.init is run from any environment on that machine.

Logger output

After you initialize and use why.log you'll notice styled output intended for human consumption that includes links to view your profiled data. This output only happens when you're using the top level why.log method in an interactive environment (like a notebook). The usual method for profiling data in production is the rolling logger that accumulates data over time and uploads in the background. The rolling logger won't output any fancy summaries or links.

Anonymous usage

If you use whylogs with an anonymous session then the generated profiles will automatically be uploaded to WhyLabs under an anonymous org. This anonymous org looks just like a normal org but has restricted features. You can see an example of an anonymous org here.

Anonymous sessions are an easy way to get started with whylogs and WhyLabs without having to create an account or deal with any configuration first. After you create an account, you'll be able to claim the anonymous sessions you generated and import them into your new personal account by clicking on the signup banner at the top of the anonymous session.

If you've already generated an anonymous session, its id will be stored locally in your whylogs.ini file and you'll continue to use it for new data until you claim it into your real account, if you care to at all.

Remember, anonymous sessions are just that: they allow anyone to view them without authentication if they have the link. Only you will have the generated links and session id of course, and you can share it with whoever you'd like. When you're ready to start profiling data that you don't want to be viewable via a link then you can create a WhyLabs account and reinitialize with python -m whylogs.api.whylabs.session.why_init, and choose a WhyLabs account.

Production usage

Production usage works the same way local usage works, except you likely won't be in an interactive environment, so there will be no prompting in the session logic. The recommended way to set up your credentials in production is via the environment variables WHYLABS_API_KEY and WHYLABS_DEFAULT_DATASET_ID. Those will be automatically picked up and used by why.init. You can technically supply these directly as why.init(whylabs_api_key='...', default_dataset_id='...'), but we discourage that because it implies that you would be committing that information to source control as well, which is a bad security practice.
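Because there is no prompt to fall back on in production, it can be useful to fail fast when the recommended variables are absent. The following is a minimal sketch of such a pre-flight check; the helper name missing_whylabs_env is hypothetical and is not part of whylogs.

```python
import os
from typing import List, Mapping, Optional

# The two environment variables recommended above for production use.
RECOMMENDED_VARS = ("WHYLABS_API_KEY", "WHYLABS_DEFAULT_DATASET_ID")


def missing_whylabs_env(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the recommended variable names that are unset or blank.

    Hypothetical helper for this sketch; pass a mapping to check
    something other than the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in RECOMMENDED_VARS if not env.get(name, "").strip()]
```

A service could call missing_whylabs_env() at startup and exit with a clear message if the returned list is not empty, then call why.init() with no arguments so the credentials never appear in source code.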

We do need to know your organization id, but our latest api key format includes that id as xxxx.xxxxxx:orgId. If you have an older api key and you can't generate a new api key easily for whatever reason, you can also supply the WHYLABS_DEFAULT_ORG_ID environment variable to explicitly set your organization id.
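To make the key format concrete, here is a rough sketch of how an organization id could be read from a key shaped like xxxx.xxxxxx:orgId, falling back to WHYLABS_DEFAULT_ORG_ID for older keys. Both function names are hypothetical, and the precedence shown (key suffix first, then the environment variable) is an assumption of this sketch rather than documented whylogs behavior.

```python
from typing import Mapping, Optional


def org_id_from_api_key(api_key: str) -> Optional[str]:
    """Return the text after the last ':' in the key, or None if absent."""
    _, sep, org_id = api_key.rpartition(":")
    return org_id if sep and org_id else None


def resolve_org_id(api_key: str, environ: Mapping[str, str]) -> Optional[str]:
    """Prefer the id embedded in the key, else WHYLABS_DEFAULT_ORG_ID.

    Assumption: this ordering is illustrative only.
    """
    return org_id_from_api_key(api_key) or environ.get("WHYLABS_DEFAULT_ORG_ID") or None
```

With a new-style key the environment variable is not needed; with an older key that has no suffix, resolve_org_id returns whatever WHYLABS_DEFAULT_ORG_ID holds, or None if it is unset.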

Customizing init behavior

The fallback logic for why.init can be customized to a small extent. You can enable or disable anonymous WhyLabs sessions and local sessions, which includes them in or removes them from the fallback logic executed in why.init. For example, if you really want to make sure that an anonymous session isn't possible, you can initialize with why.init(allow_anonymous=False).
