Max

Gets the element with the maximum value within each aggregation.

Examples

In the following examples, we create a pipeline with a PCollection and then get the element with the maximum value in different ways.

Example 1: Maximum element in a PCollection

We use CombineGlobally() to get the maximum element from the entire PCollection.

import apache_beam as beam

with beam.Pipeline() as pipeline:
  max_element = (
      pipeline
      | 'Create numbers' >> beam.Create([3, 4, 1, 2])
      | 'Get max value' >>
      beam.CombineGlobally(lambda elements: max(elements or [None]))
      | beam.Map(print))

Output:

4
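
The `elements or [None]` guard in the lambda above matters when there is nothing to combine, for instance if the PCollection is empty: Python's built-in `max` raises `ValueError` on an empty sequence. The following is a minimal plain-Python sketch, outside Beam, of what that callable computes. The helper name `max_or_none` is hypothetical and used only for illustration.

```python
def max_or_none(elements):
    """Return the largest element, or None when there are no elements."""
    # An empty list is falsy, so `elements or [None]` falls back to [None],
    # and max([None]) is None instead of raising ValueError.
    return max(elements or [None])


print(max_or_none([3, 4, 1, 2]))  # 4
print(max_or_none([]))            # None
```

Without the guard, the same call on an empty list would fail rather than yield `None`.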

Example 2: Maximum elements for each key

We use CombinePerKey() to get the maximum element for each unique key in a PCollection of key-value pairs.

import apache_beam as beam

with beam.Pipeline() as pipeline:
  elements_with_max_value_per_key = (
      pipeline
      | 'Create produce' >> beam.Create([
          ('🥕', 3),
          ('🥕', 2),
          ('🍆', 1),
          ('🍅', 4),
          ('🍅', 5),
          ('🍅', 3),
      ])
      | 'Get max value per key' >> beam.CombinePerKey(max)
      | beam.Map(print))

Output (the order of the keys may vary):

('🥕', 3)
('🍆', 1)
('🍅', 5)
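
To make the per-key aggregation concrete, here is a rough plain-Python sketch of the same result computed without Beam. It only illustrates the semantics (one maximum per unique key) and is not how Beam executes the transform. The helper name `max_per_key` is hypothetical.

```python
def max_per_key(pairs):
    """Return a dict mapping each key to the largest value seen for it."""
    result = {}
    for key, value in pairs:
        # Store the value if the key is new or the value beats the current max.
        if key not in result or value > result[key]:
            result[key] = value
    return result


produce = [
    ('🥕', 3),
    ('🥕', 2),
    ('🍆', 1),
    ('🍅', 4),
    ('🍅', 5),
    ('🍅', 3),
]
print(max_per_key(produce))
```

Each key appears once in the result, paired with its largest value, matching the output of the pipeline above.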