Mean

Transforms for computing the arithmetic mean of the elements in a collection, or the mean of the values associated with each key in a collection of key-value pairs.

Examples

In the following examples, we create a pipeline with a PCollection and then compute the mean of its elements in two ways: across the entire collection, and separately for each key.

Example 1: Mean of elements in a PCollection

We use Mean.Globally() to get the average of the elements from the entire PCollection.

import apache_beam as beam

with beam.Pipeline() as pipeline:
  mean_element = (
      pipeline
      | 'Create numbers' >> beam.Create([3, 4, 1, 2])
      | 'Get mean value' >> beam.combiners.Mean.Globally()
      | beam.Map(print))

Output:

2.5
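
The result 2.5 is simply (3 + 4 + 1 + 2) / 4. As a rough illustration of why a mean lends itself to parallel computation, the sketch below keeps a (sum, count) pair for each partition and merges the pairs before dividing. It is plain Python, not Beam code: the function names and the two-way split of the input are assumptions made for illustration and are not taken from this page.

```python
# Hypothetical helpers illustrating a mergeable (sum, count) accumulator.
# Plain Python for illustration only; not Apache Beam's implementation.

def add_input(accumulator, value):
    """Fold one value into a (sum, count) accumulator."""
    total, count = accumulator
    return total + value, count + 1


def merge_accumulators(accumulators):
    """Combine several (sum, count) accumulators into one."""
    totals, counts = zip(*accumulators)
    return sum(totals), sum(counts)


def extract_output(accumulator):
    """Turn a (sum, count) accumulator into a mean (NaN if empty)."""
    total, count = accumulator
    return total / count if count else float('nan')


def partial_state(values):
    """Build the accumulator for one partition of the input."""
    accumulator = (0, 0)
    for value in values:
        accumulator = add_input(accumulator, value)
    return accumulator


# Assume the four elements were split across two workers.
partitions = [[3, 4], [1, 2]]
partials = [partial_state(p) for p in partitions]
global_mean = extract_output(merge_accumulators(partials))
print(global_mean)
```

Averaging the per-partition means directly would only be correct when all partitions have the same size, which is why the sketch carries the count alongside the sum.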

Example 2: Mean of elements for each key

We use Mean.PerKey() to get the average of the values for each unique key in a PCollection of key-value pairs.

import apache_beam as beam

with beam.Pipeline() as pipeline:
  elements_with_mean_value_per_key = (
      pipeline
      | 'Create produce' >> beam.Create([
          ('🥕', 3),
          ('🥕', 2),
          ('🍆', 1),
          ('🍅', 4),
          ('🍅', 5),
          ('🍅', 3),
      ])
      | 'Get mean value per key' >> beam.combiners.Mean.PerKey()
      | beam.Map(print))

Output:

('🥕', 2.5)
('🍆', 1.0)
('🍅', 4.0)