apache_beam.transforms.combiners module¶
A library of basic combiner PTransform subclasses.
-
class
apache_beam.transforms.combiners.
Mean
[source]¶ Bases:
future.types.newobject.newobject
Combiners for computing arithmetic means of elements.
-
class
Globally
(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Mean.Globally computes the arithmetic mean of the elements.
-
class
PerKey
(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Mean.PerKey finds the means of the values for each key.
-
class
-
class
apache_beam.transforms.combiners.
Count
[source]¶ Bases:
future.types.newobject.newobject
Combiners for counting elements.
-
class
Globally
(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Count.Globally counts the total number of elements.
-
class
PerKey
(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Count.PerKey counts how many elements each unique key has.
-
class
PerElement
(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Count.PerElement counts how many times each element occurs.
-
class
-
class
apache_beam.transforms.combiners.
Top
[source]¶ Bases:
future.types.newobject.newobject
Combiners for obtaining extremal elements.
-
class
Of
(n, compare=None, *args, **kwargs)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
Obtain a list of the compare-most N elements in a PCollection.
This transform will retrieve the n greatest elements in the PCollection to which it is applied, where “greatest” is determined by the comparator function supplied as the compare argument.
Initializer.
compare should be an implementation of “a < b” taking at least two arguments (a and b). Additional arguments and side inputs specified in the apply call become additional arguments to the comparator. Defaults to the natural ordering of the elements. The arguments ‘key’ and ‘reverse’ may instead be passed as keyword arguments, and have the same meaning as for Python’s sort functions.
Parameters: - pcoll – PCollection to process.
- n – number of elements to extract from pcoll.
- compare – as described above.
- *args – as described above.
- **kwargs – as described above.
-
class
PerKey
(n, compare=None, *args, **kwargs)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
Identifies the compare-most N elements associated with each key.
This transform will produce a PCollection mapping unique keys in the input PCollection to the n greatest elements with which they are associated, where “greatest” is determined by the comparator function supplied as the compare argument in the initializer.
Initializer.
compare should be an implementation of “a < b” taking at least two arguments (a and b). Additional arguments and side inputs specified in the apply call become additional arguments to the comparator. Defaults to the natural ordering of the elements.
The arguments ‘key’ and ‘reverse’ may instead be passed as keyword arguments, and have the same meaning as for Python’s sort functions.
Parameters: - n – number of elements to extract from input.
- compare – as described above.
- *args – as described above.
- **kwargs – as described above.
-
static
Largest
(*args, **kwargs)¶
-
static
Smallest
(*args, **kwargs)¶
-
static
LargestPerKey
(*args, **kwargs)¶
-
static
SmallestPerKey
(*args, **kwargs)¶
-
class
-
class
apache_beam.transforms.combiners.
Sample
[source]¶ Bases:
future.types.newobject.newobject
Combiners for sampling n elements without replacement.
-
class
FixedSizeGlobally
(n)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
Sample n elements from the input PCollection without replacement.
-
class
-
class
apache_beam.transforms.combiners.
ToList
(label='ToList')[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
A global CombineFn that condenses a PCollection into a single list.
-
class
apache_beam.transforms.combiners.
ToDict
(label='ToDict')[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
A global CombineFn that condenses a PCollection into a single dict.
PCollections should consist of 2-tuples, notionally (key, value) pairs. If multiple values are associated with the same key, only one of the values will be present in the resulting dict.