In addition to tabular and textual data, whylogs can profile image data. It automatically computes a number of image metrics, including the following:
- Brightness (mean, standard deviation)
- Hue (mean, standard deviation)
- Saturation (mean, standard deviation)
- Image height and width (in pixels)
- Colorspace (e.g. RGB, HSV)
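To make these metrics concrete, here is a minimal sketch of how the mean and standard deviation of hue, saturation, and brightness could be computed from raw pixels. It is not the whylogs implementation: it assumes pixels are given as `(r, g, b)` tuples with 0-255 values, treats the HSV "value" channel as brightness, and uses only the Python standard library. The function name `image_stats` and the sample pixels are illustrative.

```python
import colorsys
from statistics import mean, pstdev


def image_stats(pixels):
    """Return (mean, standard deviation) of hue, saturation and brightness.

    pixels: iterable of (r, g, b) tuples with channel values in 0-255.
    Assumption: brightness is taken to be the HSV "value" channel.
    """
    hues, saturations, brightness = [], [], []
    for r, g, b in pixels:
        # colorsys expects channel values in the range 0.0-1.0
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hues.append(h)
        saturations.append(s)
        brightness.append(v)
    return {
        "hue": (mean(hues), pstdev(hues)),
        "saturation": (mean(saturations), pstdev(saturations)),
        "brightness": (mean(brightness), pstdev(brightness)),
    }


# Four illustrative pixels: red, blue, mid-gray, black
pixels = [(255, 0, 0), (0, 0, 255), (128, 128, 128), (0, 0, 0)]
stats = image_stats(pixels)
```

Each metric is reduced to a pair of summary numbers per image, which is what makes it practical to track them over time like any other numeric feature.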
We will demonstrate WhyLabs' image logging and monitoring capabilities using sets of images with anomalies injected into the raw data. Consider the following sets of images.
Set 1: These images appear roughly uniform. We will treat this set as our baseline.
Set 2: The color channels in these images appear to have been swapped. This can happen when a different library or library version is used to read the images, or when an issue is introduced into the data pipeline.
Set 3: Some of these images are darker than others, which could result from using different devices to take the photographs or from different device settings such as shutter speed.
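The two kinds of anomaly described above leave different fingerprints in the image metrics. The sketch below uses a handful of made-up pixel values (not the actual image sets) to illustrate this: swapping the red and blue channels shifts the mean hue but leaves brightness unchanged, while darkening lowers brightness but leaves hue unchanged. As an assumption, brightness is taken to be the HSV "value" channel.

```python
import colorsys
from statistics import mean


def mean_hue_and_brightness(pixels):
    """Return (mean hue, mean brightness) for (r, g, b) pixels in 0-255."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    return mean(h for h, _, _ in hsv), mean(v for _, _, v in hsv)


# Illustrative baseline: warm, orange-toned pixels
baseline = [(200, 120, 40), (180, 100, 60), (220, 140, 20)]

# Anomaly of the Set 2 kind: red and blue channels swapped (RGB read as BGR)
swapped = [(b, g, r) for r, g, b in baseline]

# Anomaly of the Set 3 kind: every channel halved (a darker exposure)
darker = [(r // 2, g // 2, b // 2) for r, g, b in baseline]

base_hue, base_brightness = mean_hue_and_brightness(baseline)
swapped_hue, swapped_brightness = mean_hue_and_brightness(swapped)
darker_hue, darker_brightness = mean_hue_and_brightness(darker)
```

This is why a channel swap is expected to show up in the hue metric, and an exposure change in the brightness metric.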
The following code demonstrates how to log a profile for a folder of images.
Note that the log_image method only accepts a single image as input. Users are advised to loop through their collection of images within a single logger session to log an aggregate profile for their image set.
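The loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the original example: it assumes `log_image` accepts a file path, the helper name `log_image_folder` and the set of file extensions are our own, and `RecordingLogger` is a hypothetical stand-in so that the snippet runs without whylogs installed.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def log_image_folder(logger, folder):
    """Call logger.log_image once per image file in `folder`; return the count."""
    logged = 0
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            logger.log_image(str(path))  # log_image takes a single image per call
            logged += 1
    return logged


class RecordingLogger:
    """Hypothetical stand-in for a whylogs logger, used only for this demo."""

    def __init__(self):
        self.paths = []

    def log_image(self, path):
        self.paths.append(Path(path).name)


# Demo on a temporary folder containing two "images" and one unrelated file
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.jpg", "b.PNG", "notes.txt"):
        (Path(folder) / name).touch()
    demo_logger = RecordingLogger()
    logged_count = log_image_folder(demo_logger, folder)
```

In real use, `logger` would be the whylogs logger opened for the session, so that every call inside the loop contributes to the same aggregate profile.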
Once uploaded to WhyLabs, these metrics are tracked similarly to descriptive statistics generated when profiling tabular data. In the image below, we see that the mean brightness of logged images suddenly dropped on February 23rd, causing a spike in the distribution distance as compared to the reference profile. This anomaly is associated with images from Set 3.
Similarly, we see a spike in the image hue on February 22nd, which corresponds to the swapped color channels in Set 2.
Furthermore, because whylogs profiles are mergeable, users can combine profiles from different data types into a single WhyLabs model. This is useful when working with multi-modal models, or when metadata is available to supplement your dataset.
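To illustrate what mergeability means, here is a simplified sketch in which a "profile" is just a dictionary of per-column running statistics. It is not how whylogs stores profiles internally; the class `RunningStats`, the helpers `profile` and `merge_profiles`, and all values are hypothetical. The point is that summaries built separately can be combined without revisiting the raw data, whether they cover the same column (two image batches) or different columns (image metrics and tabular metadata).

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    def update(self, value):
        # Welford's online update for mean and variance
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other):
        # Combine two summaries without access to the underlying values
        total = self.count + other.count
        if total == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta ** 2 * self.count * other.count / total
        return RunningStats(total, mean, m2)

    @property
    def stddev(self):
        return math.sqrt(self.m2 / self.count) if self.count else 0.0


def profile(rows):
    """Build a {column: RunningStats} summary from a list of dict rows."""
    stats = {}
    for row in rows:
        for column, value in row.items():
            stats.setdefault(column, RunningStats()).update(value)
    return stats


def merge_profiles(left, right):
    """Merge two summaries column by column, keeping the union of columns."""
    return {
        column: left.get(column, RunningStats()).merge(right.get(column, RunningStats()))
        for column in left.keys() | right.keys()
    }


# Illustrative values only
image_batch_1 = profile([{"brightness": 0.62}, {"brightness": 0.58}])
image_batch_2 = profile([{"brightness": 0.21}, {"brightness": 0.25}])
metadata = profile([{"shutter_speed": 0.004}, {"shutter_speed": 0.002}])

combined = merge_profiles(merge_profiles(image_batch_1, image_batch_2), metadata)
```

The merged result contains both the image column and the metadata column, and its brightness statistics match what a single pass over all four values would produce.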
Suppose each image set is associated with a tabular dataset containing metadata, such as the shutter speed of the camera used to take each photograph and the version of the image library used for pre-processing. This tabular metadata can be included in the logged profiles by adjusting the previous code as follows.
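One way this adjustment could look is sketched below. Treat it as an illustration rather than the original code: it assumes the logger exposes a `log` method that accepts a dictionary of feature values alongside `log_image`, and that `log_image` accepts a file path. The helper `log_images_with_metadata`, the stand-in `RecordingLogger`, the file paths, and the metadata values are all hypothetical.

```python
def log_images_with_metadata(logger, records):
    """Log each image and its metadata row through the same logger.

    records: iterable of (image_path, metadata_dict) pairs.
    """
    for image_path, metadata in records:
        logger.log_image(image_path)  # image metrics
        logger.log(metadata)          # assumption: dict-based logging of tabular features


class RecordingLogger:
    """Hypothetical stand-in for a whylogs logger, used only for this demo."""

    def __init__(self):
        self.calls = []

    def log_image(self, path):
        self.calls.append(("image", path))

    def log(self, features):
        self.calls.append(("features", dict(features)))


# Hypothetical paths and metadata values
records = [
    ("set3/img_001.jpg", {"shutter_speed": 0.004, "library_version": "1.2.0"}),
    ("set3/img_002.jpg", {"shutter_speed": 0.001, "library_version": "1.2.0"}),
]

demo_logger = RecordingLogger()
log_images_with_metadata(demo_logger, records)
```

Because both calls go through the same logger, the image metrics and the metadata columns end up in the same profile.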
When viewing the results in WhyLabs, we now see that the tabular metadata is monitored alongside the image data. In fact, this can be used to help troubleshoot in some cases. In the image below, we find that an anomaly in brightness correlates with an anomaly in the shutter speed used.
Some images contain EXIF data, which typically includes metadata stored by the device that took the photograph. If EXIF data is available for images profiled by whylogs, it will automatically be included in the image profile.
Computer vision use cases can be highly specific. For this reason, users are able to define their own custom functions to operate on an image array when logging an image profile. In the following example, a custom function is built to extract the blue channel of the image.
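A blue-channel extractor of this kind could be sketched as follows. To keep the snippet dependency-free, the image is represented as nested lists of `(r, g, b)` pixels in RGB order, which is an assumption; how such a function is registered with whylogs is not shown here, and the names `blue_channel` and `mean_blue` are our own.

```python
def blue_channel(image):
    """Return the blue channel of an image as rows of values.

    image: rows of (r, g, b) pixels; RGB channel order is assumed.
    """
    return [[pixel[2] for pixel in row] for row in image]


def mean_blue(image):
    """Average blue intensity, a simple statistic derived from the channel."""
    values = [value for row in blue_channel(image) for value in row]
    return sum(values) / len(values)


# A 2x2 illustrative image
image = [
    [(10, 20, 30), (0, 0, 255)],
    [(255, 255, 0), (5, 5, 5)],
]

blue = blue_channel(image)
average_blue = mean_blue(image)
```

For an image held as a NumPy array of shape height x width x 3 in RGB order, the same extraction would be `image[:, :, 2]`.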
See more in this example notebook for image profiling.