apache_beam.transforms.stats module
This module contains all statistics-related transforms.
class apache_beam.transforms.stats.ApproximateUnique
Bases: object
Hashes the input elements and uses a sample of the hash values to extrapolate the size of the entire set of hash values, assuming that the remaining hash values are as densely distributed as those in the sample.
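The extrapolation idea described above can be illustrated with a small, self-contained sketch. This is not Beam's implementation: it swaps in the classic k-minimum-values estimator, and the names hash64, estimate_unique, and HASH_SPACE are hypothetical helpers chosen for illustration only.

```python
import hashlib
import heapq

HASH_SPACE = 2 ** 64  # size of the (assumed) 64-bit hash space


def hash64(element):
    """Map an element to a deterministic 64-bit integer (illustrative choice)."""
    digest = hashlib.sha256(repr(element).encode('utf-8')).digest()
    return int.from_bytes(digest[:8], 'big')


def estimate_unique(elements, sample_size=1024):
    """Estimate the number of distinct elements from a bounded hash sample.

    Keeps only the `sample_size` smallest distinct hash values. If the sample
    never fills up, the count is exact; otherwise the density of the sample
    is extrapolated to the whole hash space.
    """
    heap = []        # max-heap (negated values) of the smallest hashes seen
    members = set()  # hashes currently held in the heap
    for element in elements:
        h = hash64(element)
        if h in members:
            continue
        if len(heap) < sample_size:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # Replace the largest retained hash with the new, smaller one.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < sample_size:
        return len(heap)
    kth_smallest = -heap[0]
    # sample_size hashes fall below kth_smallest; assume the same density
    # across the rest of the hash space.
    return int((sample_size - 1) * HASH_SPACE / kth_smallest)


if __name__ == '__main__':
    data = list(range(10000)) * 2  # 10000 distinct values, each seen twice
    print(estimate_unique(data))
```

A larger sample gives a tighter estimate at the cost of more memory, which is the same trade-off that the size and error parameters of this class expose.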
static parse_input_params(size=None, error=None)
Checks whether the input parameters are valid and returns the sample size.
Parameters:
- size – an int no smaller than 16, used as the sample size for estimating the number of unique values.
- error – the maximum estimation error, a float between 0.01 and 0.50. If error is given, the sample size is calculated from it by the _get_sample_size_from_est_error function.
Returns: the sample size.
Raises: ValueError – if both size and error are given, if neither is given, or if a value is out of range.
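The validation rules listed above can be sketched in plain Python. This is a hypothetical re-implementation for illustration, not the library's code: the function names parse_params_sketch and sample_size_from_error are invented here, and the relation sample size = ceil(4 / error^2) is an assumption about what _get_sample_size_from_est_error computes.

```python
import math


def sample_size_from_error(error):
    """Assumed relation between error and sample size: ceil(4 / error^2)."""
    return int(math.ceil(4.0 / error ** 2))


def parse_params_sketch(size=None, error=None):
    """Validate size/error as documented and return the sample size."""
    if size is None and error is None:
        raise ValueError('Either size or error must be given.')
    if size is not None and error is not None:
        raise ValueError('Give either size or error, not both.')
    if size is not None:
        if not isinstance(size, int) or size < 16:
            raise ValueError('size must be an int no smaller than 16.')
        return size
    if error < 0.01 or error > 0.5:
        raise ValueError('error must be between 0.01 and 0.50.')
    return sample_size_from_error(error)


if __name__ == '__main__':
    print(parse_params_sketch(size=16))
    print(parse_params_sketch(error=0.25))
```

Under the assumed formula, halving the error roughly quadruples the sample size, so a tight error bound is noticeably more expensive than a loose one.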
class Globally(size=None, error=None)
Bases: apache_beam.transforms.ptransform.PTransform
ApproximateUnique.Globally approximates the number of unique values in the input.
class PerKey(size=None, error=None)
Bases: apache_beam.transforms.ptransform.PTransform
ApproximateUnique.PerKey approximates the number of unique values per key.
-
static