Cloud and On-Premises Deployments
The whylogs library is designed to integrate with any data pipeline. One reason this works is that whylogs is agnostic to the infrastructure it runs on, whether that is a popular cloud service such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, or an on-premises environment.
Integrating whylogs with any of these environments comes down to two steps: installing the library in that environment, and calling whylogs from your Python code, Java code, or Spark jobs.
The examples below walk through the installation for each environment.
AWS
whylogs can be integrated with any AWS offering which allows users to run Python, Java or Apache Spark. This includes popular services like SageMaker, Lambda, EKS, ECS, and EMR. The example below demonstrates how a user would install whylogs on an EC2 instance.
First, head to the AWS Console and select Launch a virtual machine.
Next, select an OS Image. For the examples in this document, Ubuntu v20.x is used. Configure the virtual machine as desired, and click Launch Instance.
After the instance is created and running, select the instance from the list of virtual machines, and click Connect.
Select EC2 Instance Connect, enter your username, and click Connect.
This opens a command prompt on the newly created virtual machine. The exact process and syntax for installing whylogs will depend on the operating system, and its version, used by the virtual machine. In any case, users will need to:
- Install Python if it’s not already installed.
- Install pip if it’s not already installed.
- Install whylogs using pip.
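Before running any install commands, it can help to see which of these three prerequisites the machine already satisfies. The following is a minimal sketch using only the Python standard library. The function name `check_prerequisites` and the minimum Python version `(3, 7)` are illustrative assumptions, not values taken from the whylogs documentation; check the requirements of the whylogs release you plan to install.

```python
import importlib.util
import sys


def check_prerequisites(min_python=(3, 7)):
    """Report which of the three prerequisites are already satisfied.

    min_python is an assumed threshold for illustration only.
    """
    return {
        # Python: compare the running interpreter to the assumed minimum.
        "python": sys.version_info[:2] >= min_python,
        # pip and whylogs: find_spec returns None when a module cannot be found.
        "pip": importlib.util.find_spec("pip") is not None,
        "whylogs": importlib.util.find_spec("whylogs") is not None,
    }


if __name__ == "__main__":
    for name, present in check_prerequisites().items():
        print(f"{name}: {'ok' if present else 'missing'}")
```

Any entry reported as missing corresponds to one of the install steps in the list above.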
In the case of Ubuntu v20.x, Python is already installed. Before installing pip, users may first need to refresh the package lists tracked by Ubuntu. This can be done with the following command:
sudo apt-get update
Users can then install pip, and then install the whylogs library:
sudo apt install python3-pip
pip install whylogs
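To confirm that the installation succeeded, one option is to ask Python which version of the package it can see. This is a small sketch, assuming Python 3.8 or later (where `importlib.metadata` is part of the standard library); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package="whylogs"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("whylogs is not installed for this interpreter")
    else:
        print(f"whylogs {version} is installed")
```

If this reports that whylogs is missing right after a successful `pip install`, the `pip` command may belong to a different Python interpreter than the one running the script.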
Users can then update their Python code, Java code, etc. in order to incorporate whylogs' logging capabilities into their pipeline.
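As a rough illustration of what that update might look like in Python, the sketch below profiles one batch of records. It assumes the whylogs v1 Python API (`why.log` and `view().to_pandas()`) and that pandas is also installed; older whylogs releases expose a different interface. The helper name `profile_batch` and the sample records are hypothetical.

```python
try:
    import pandas as pd
    import whylogs as why  # assumes the whylogs v1 Python API
except ImportError:  # whylogs or pandas is not installed yet
    pd = why = None


def profile_batch(records):
    """Profile one batch of records (a list of dicts) and return a summary table."""
    if why is None:
        raise RuntimeError("whylogs and pandas must be installed first")
    df = pd.DataFrame(records)
    results = why.log(pandas=df)       # build a statistical profile of the batch
    return results.view().to_pandas()  # one summary row per profiled column


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real pipeline batch.
    batch = [{"price": 9.99, "qty": 3}, {"price": 4.50, "qty": 1}]
    print(profile_batch(batch))
```

In a real pipeline, the call to `profile_batch` would sit wherever a batch of data is already available, for example right after it is loaded or just before it is passed to a model.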
Google Cloud Platform (GCP)
whylogs can be integrated with any GCP offering which allows users to run Python, Java or Apache Spark. This includes popular services like GCE, GKE, GCF, GAE, and GCR. The example below demonstrates how a user would install whylogs on a virtual machine hosted on Google Compute Engine (GCE).
First, head to the GCP Console. From there, select VM Instances from the Compute Engine menu.
Click Create Instance and configure the instance as desired.
Wait until the instance has been created and is running, then click on the new instance.
Under Details, choose the desired connection method.
Users will arrive at the command prompt, where they can install Python, pip, and then whylogs according to the operating system of their chosen machine image. For Ubuntu v20.x, the same commands from the Amazon EC2 example apply:
sudo apt-get update
sudo apt install python3-pip
pip install whylogs
Microsoft Azure
whylogs can be integrated with any Azure offering which allows users to run Python, Java or Apache Spark. This includes popular services like Azure Web Apps, Azure Functions, Azure WebJobs, Azure Kubernetes Service, and Azure Virtual Machines. The example below demonstrates how a user would install whylogs on an Azure Virtual Machine.
First, head to the Azure Portal. From here, click Create under the Virtual machines tile.
Configure the virtual machine as desired.
Once the instance is created and running, click on the new instance and connect. The Bastion option allows users to connect to their virtual machine directly through a browser-based interface.
Again, the steps and syntax for installing whylogs will depend on the operating system of the chosen virtual machine, but the steps for Ubuntu v20.x remain unchanged from the previous examples:
sudo apt-get update
sudo apt install python3-pip
pip install whylogs
On-Premises
As with the cloud services discussed above, whylogs can be integrated with any on-premises environment that can run Python, Java, or Apache Spark. Again, the exact steps for installing Python and pip will vary depending on the server’s operating system.
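Once Python and pip are present, the whylogs install step itself does not need to change between operating systems. One way to express this is to invoke pip through the interpreter that is already running, as in the sketch below. The function names `pip_install_command` and `install`, and the `dry_run` flag, are hypothetical and exist only for this illustration.

```python
import subprocess
import sys


def pip_install_command(package="whylogs"):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package="whylogs", dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = pip_install_command(package)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.check_call(cmd)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    install(dry_run=True)
```

Running pip as `python -m pip` ties the installation to a specific interpreter, which avoids ambiguity on servers where several Python versions are installed side by side.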