One of the core tenets of WhyLabs is ensuring the safety of customer data. WhyLabs accomplishes this mainly by not collecting raw data in the first place. We utilize streaming algorithms to construct accurate statistical representations of your data, and use these statistics to monitor the quality of your data. Your raw data never leaves your defense perimeter.
We use an open source library called whylogs to generate statistical profiles of the raw data. All raw data processing happens on the customer's side, on the customer's own machines. The generated statistical profiles for each monitored feature (for example, min, max, and distribution) are then sent to WhyLabs, and we monitor these statistics for drift.
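To illustrate the idea of profiling without retaining raw data, here is a minimal sketch in plain Python. It is not the whylogs implementation; the `StreamingSummary` class and `profile_feature` helper are hypothetical names introduced for this example. It processes values one at a time (using Welford's online algorithm for the mean and standard deviation) and keeps only a handful of running statistics, so the only thing that would need to leave the customer's machine is the small summary dictionary.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StreamingSummary:
    """Running statistics for one numeric feature. No raw values are stored."""

    count: int = 0
    minimum: float = math.inf
    maximum: float = -math.inf
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold a single observation into the summary (Welford's algorithm)."""
        self.count += 1
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def to_profile(self) -> dict:
        """Return only aggregate statistics, suitable for sending off-machine."""
        empty = self.count == 0
        variance = self._m2 / (self.count - 1) if self.count > 1 else 0.0
        min_value: Optional[float] = None if empty else self.minimum
        max_value: Optional[float] = None if empty else self.maximum
        return {
            "count": self.count,
            "min": min_value,
            "max": max_value,
            "mean": self.mean,
            "stddev": math.sqrt(variance),
        }


def profile_feature(values: Iterable[float]) -> dict:
    """Profile a stream of values in a single pass (hypothetical helper)."""
    summary = StreamingSummary()
    for value in values:
        summary.update(value)
    return summary.to_profile()
```

Calling `profile_feature` on an iterable of numbers returns a dictionary with `count`, `min`, `max`, `mean`, and `stddev`. Memory use stays constant regardless of how many values are processed, which is the property that makes a streaming approach practical on large datasets.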
These profiles generally do not contain personally identifiable or proprietary information in any meaningful form. However, some of the statistics we are able to collect can capture values from sensitive features. For example, collecting Top-K Frequent Items on a user_email feature would lead us to store the most frequent email addresses appearing within this feature. Customers are free to disable collection of Top-K or any other statistics on sensitive features, or to exclude such features from monitoring entirely. Alternatively, some customers may opt to protect sensitive data with methods such as one-way hashing or encryption, which preserves the ability to monitor these features.
A similar concern may arise with regard to the names of the features themselves. If any feature names are considered sensitive within a given model, this can be addressed in the same way as described above.
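The one-way hashing option mentioned above can be sketched as follows, applied before the data is profiled. This is a minimal illustration using only the Python standard library, not a WhyLabs or whylogs API; the function names, the `feature_` prefix, and the 12-character truncation are assumptions made for the example. It uses a keyed hash (HMAC-SHA256) rather than a bare hash, on the assumption that values such as email addresses are guessable and an unkeyed hash could be reversed with a dictionary attack. The secret key stays with the customer.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed one-way hash (HMAC-SHA256, hex encoded) of a string."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def protect_record(
    record: dict,
    sensitive_values: set,
    sensitive_names: set,
    secret_key: bytes,
) -> dict:
    """Hash sensitive values and/or sensitive feature names in one record.

    `sensitive_values` lists features whose values should be hashed;
    `sensitive_names` lists features whose names should be hashed.
    Both are hypothetical parameters for this sketch.
    """
    protected = {}
    for name, value in record.items():
        if name in sensitive_values:
            value = pseudonymize(str(value), secret_key)
        if name in sensitive_names:
            # Truncated for readability; keep the full digest if collisions matter.
            name = "feature_" + pseudonymize(name, secret_key)[:12]
        protected[name] = value
    return protected
```

Because the hash is deterministic for a given key, equal inputs map to equal outputs, so frequency-based statistics such as Top-K remain meaningful for monitoring while the stored items are hashes rather than the original values.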
WhyLabs uses industry-standard data storage, security, and privacy practices. Statistical profiles are encrypted in transit (both from customers and within our own network) and at rest. We do not share this data with any external parties. The data is made available only to the customer that owns it and, with customer permission, to WhyLabs employees with sufficient access rights, for debugging purposes only.
WhyLabs does not utilize customer data for any purpose other than to provide monitoring services.