In this section we will learn how to integrate whylogs into a FastAPI server that uploads profiles to WhyLabs.
The following is an example of a simple prediction endpoint.
To integrate the above endpoint with WhyLabs, we will need to:
- Start a logger, which will keep a profile in memory until it's time to merge and write it
- Profile the output DataFrame with the logger on each prediction
- Close the logger when the app is shut down
⚠️ Best practice is to set these environment variables at the machine/environment level (for example, on a CI/QA machine or in a Kubernetes pod) to avoid checking credentials into source control.
In general, whylogs is quite fast at bulk logging, but it has a fixed overhead per log call, so some traffic patterns may not lend themselves well to synchronous logging. If you can't afford the additional latency that whylogs would add to your inference pipeline, you should consider decoupling whylogs from the request path.
Instead of directly logging the data on every call, you can send the data to a message queue like SQS to asynchronously log.
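A decoupled setup can be as simple as serializing each record onto the queue. The helper below is a standard-library sketch that takes an already-constructed boto3 SQS client; the queue URL, helper name, and record shape are illustrative, and a separate worker would drain the queue and profile the records with whylogs:

```python
import json

def enqueue_for_logging(sqs_client, queue_url: str, record: dict) -> None:
    """Push one inference record onto SQS instead of profiling it inline;
    a separate consumer reads the queue and calls whylogs off the hot path."""
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(record))

# In the endpoint, replace the direct logging call with something like:
#   enqueue_for_logging(boto3.client("sqs"), QUEUE_URL, row_as_dict)
```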
You can also use our whylogs container to host a dedicated profiling endpoint: each inference then makes an I/O-bound REST call rather than executing CPU-bound logging in-process.
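Calling such a dedicated endpoint is an ordinary HTTP request. The sketch below uses only the standard library; the `/logs` path and the payload shape are assumptions, so check the whylogs container documentation for the exact request format:

```python
import json
import urllib.request

def build_log_request(dataset_id: str, row: dict) -> bytes:
    # Payload shape is illustrative; see the whylogs container docs
    return json.dumps({"datasetId": dataset_id, "single": row}).encode()

def send_to_profiling_service(base_url: str, dataset_id: str, row: dict):
    # Fire an I/O-bound POST instead of doing CPU-bound profiling in-process
    req = urllib.request.Request(
        f"{base_url}/logs",
        data=build_log_request(dataset_id, row),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```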
In this documentation page, we showed how to integrate a FastAPI prediction endpoint with WhyLabs, using whylogs profiles and the built-in WhyLabs writer. If you have questions or want to learn more about using WhyLabs with your FastAPI models, contact us at any time!