Defines the ColumnProfile class for tracking per-column statistics

ColumnProfile Objects

class ColumnProfile()

Statistics tracking for a column (i.e. a feature)

The primary method for

Parameters

name : str (required) Name of the column profile

number_tracker : NumberTracker Implements numeric data statistics tracking

string_tracker : StringTracker Implements string data-type statistics tracking

schema_tracker : SchemaTracker Implements tracking of schema-related information

counters : CountersTracker Keep count of various things

frequent_items : FrequentItemsSketch Keep track of all frequent items, even for mixed datatype features

cardinality_tracker : HllSketch Track feature cardinality (even for mixed data types)

constraints : ValueConstraints Static assertions to be applied to numeric data tracked in this column

TODO:

  • Proper TypedDataConverter type checking
  • Multi-threading/parallelism

track

 | track(value, character_list=None, token_method=None)

Add value to tracking statistics.
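The idea of adding a value to running statistics can be illustrated with a small stand-in class. This is only a sketch: `ToyColumnProfile` and its attributes (`count`, `type_counts`, `numeric_min`, `numeric_max`) are hypothetical names invented for illustration, and the real trackers (NumberTracker, SchemaTracker, and so on) are not reproduced here.

```python
from collections import Counter


class ToyColumnProfile:
    """Hypothetical stand-in for ColumnProfile; not the library's implementation."""

    def __init__(self, name):
        self.name = name
        self.count = 0                # total number of values seen
        self.type_counts = Counter()  # rough analogue of schema tracking
        self.numeric_min = None       # rough analogue of numeric tracking
        self.numeric_max = None

    def track(self, value):
        """Add one value to the running statistics."""
        self.count += 1
        self.type_counts[type(value).__name__] += 1
        # Only real numbers feed the numeric statistics; bool is excluded
        # because it is a subclass of int in Python.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if self.numeric_min is None or value < self.numeric_min:
                self.numeric_min = value
            if self.numeric_max is None or value > self.numeric_max:
                self.numeric_max = value


profile = ToyColumnProfile("age")
for v in [31, 47.5, "unknown", None]:
    profile.track(v)

print(profile.count, dict(profile.type_counts))
print(profile.numeric_min, profile.numeric_max)
```

Each call updates every statistic in a single pass, so the column never needs to be held in memory as a whole.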

to_summary

 | to_summary()

Generate a summary of the statistics

Returns

summary : ColumnSummary Protobuf summary message.

merge

 | merge(other)

Merge this column profile with another.

Parameters

other : ColumnProfile

Returns

merged : ColumnProfile A new, merged column profile.
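Because merge returns a new profile, the two inputs are left as they were. The sketch below shows that contract on a toy class. `ToyColumnProfile`, its fields, and the name-mismatch check are assumptions made for illustration; the actual merge combines the underlying trackers and sketches, which are not modelled here.

```python
class ToyColumnProfile:
    """Hypothetical stand-in for ColumnProfile; not the library's implementation."""

    def __init__(self, name, count=0, minimum=None, maximum=None):
        self.name = name
        self.count = count
        self.minimum = minimum
        self.maximum = maximum

    def track(self, value):
        """Add one numeric value to the running statistics."""
        self.count += 1
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    def merge(self, other):
        """Return a new profile combining both inputs; neither input is modified."""
        # Assumption of this sketch: only profiles of the same column are merged.
        if self.name != other.name:
            raise ValueError("cannot merge profiles of different columns")
        mins = [m for m in (self.minimum, other.minimum) if m is not None]
        maxs = [m for m in (self.maximum, other.maximum) if m is not None]
        return ToyColumnProfile(
            self.name,
            count=self.count + other.count,
            minimum=min(mins) if mins else None,
            maximum=max(maxs) if maxs else None,
        )


# Two profiles of the same column, e.g. built from two batches of data.
a = ToyColumnProfile("latency_ms")
for v in (12, 40, 7):
    a.track(v)

b = ToyColumnProfile("latency_ms")
for v in (95, 3):
    b.track(v)

merged = a.merge(b)
print(merged.count, merged.minimum, merged.maximum)
```

Statistics that combine this way (counts add, minima and maxima compare) let batches be profiled separately and combined afterwards.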

to_protobuf

 | to_protobuf()

Return the object serialized as a protobuf message

Returns

message : ColumnMessage

from_protobuf

 | @staticmethod
 | from_protobuf(message)

Load from a protobuf message

Returns

column_profile : ColumnProfile
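to_protobuf and from_protobuf form a serialize/deserialize pair: the first is an instance method, the second a static method that builds a new profile. The round trip can be sketched as follows, with a plain dict standing in for the ColumnMessage protobuf. `ToyColumnProfile` and the dict layout are hypothetical and say nothing about the real message fields.

```python
class ToyColumnProfile:
    """Hypothetical stand-in for ColumnProfile; not the library's implementation."""

    def __init__(self, name, count=0):
        self.name = name
        self.count = count

    def track(self, value):
        """Count one value."""
        self.count += 1

    def to_protobuf(self):
        """Serialize the profile; a dict stands in for the protobuf message."""
        return {"name": self.name, "count": self.count}

    @staticmethod
    def from_protobuf(message):
        """Build a new profile from a message produced by to_protobuf."""
        return ToyColumnProfile(message["name"], count=message["count"])


original = ToyColumnProfile("price")
for v in (9.99, 14.5, 3.0):
    original.track(v)

message = original.to_protobuf()
restored = ToyColumnProfile.from_protobuf(message)
print(restored.name, restored.count)
```

Since from_protobuf is static, it is called on the class itself and needs no existing instance.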
