Class Combine

java.lang.Object
org.apache.beam.sdk.transforms.Combine

public class Combine extends Object
PTransforms for combining PCollection elements globally and per-key.

See the documentation for how to use the operations in this class.

  • Method Details

    • globally

      public static <V> Combine.Globally<V,V> globally(SerializableFunction<Iterable<V>,V> combiner)
      Returns a Combine.Globally PTransform that uses the given SerializableFunction to combine all the elements in each window of the input PCollection into a single value in the output PCollection. The types of the input elements and the output elements must be the same.

      If the input PCollection is windowed into GlobalWindows, a default value in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either Combine.Globally.withoutDefaults() or Combine.Globally.asSingletonView() must be called.

      See Combine.Globally for more information.
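
      The single-value contract above can be illustrated with a minimal plain-Java model. This is not Beam code: the class IterableCombinerSketch and its methods are hypothetical, and java.util.function.Function stands in for SerializableFunction. The sketch assumes the function may be applied first to subsets of the input and then to the partial results, which is why a combiner of this shape should be associative and commutative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java model (no Beam dependency) of combining with an Iterable-to-value function.
class IterableCombinerSketch {

  // Stand-in for a SerializableFunction<Iterable<Integer>, Integer> that sums its input.
  static final Function<Iterable<Integer>, Integer> SUM = values -> {
    int total = 0;
    for (int v : values) {
      total += v;
    }
    return total;
  };

  // Combines each bundle separately, then combines the partial results.
  // Assumption: this mimics a runner applying the function more than once.
  static <V> V combineInBundles(List<List<V>> bundles, Function<Iterable<V>, V> combiner) {
    List<V> partials = new ArrayList<>();
    for (List<V> bundle : bundles) {
      partials.add(combiner.apply(bundle));
    }
    return combiner.apply(partials);
  }

  // Sums 1..5 split across two hypothetical bundles.
  static int demo() {
    List<List<Integer>> bundles = new ArrayList<>();
    bundles.add(Arrays.asList(1, 2));
    bundles.add(Arrays.asList(3, 4, 5));
    return combineInBundles(bundles, SUM);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 15
  }
}
```

      Because summation is associative and commutative, the result is the same however the elements are split into bundles. A function such as "first element" would not have that property.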

    • globally

      public static <V> Combine.Globally<V,V> globally(SerializableBiFunction<V,V,V> combiner)
      Returns a Combine.Globally PTransform that uses the given SerializableBiFunction to combine all the elements in each window of the input PCollection into a single value in the output PCollection. The types of the input elements and the output elements must be the same.

      If the input PCollection is windowed into GlobalWindows, a default value in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either Combine.Globally.withoutDefaults() or Combine.Globally.asSingletonView() must be called.

      See Combine.Globally for more information.

    • globally

      public static <InputT, OutputT> Combine.Globally<InputT,OutputT> globally(CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> fn)
      Returns a Combine.Globally PTransform that uses the given GlobalCombineFn to combine all the elements in each window of the input PCollection into a single value in the output PCollection. The types of the input elements and the output elements can differ.

      If the input PCollection is windowed into GlobalWindows, a default value in the GlobalWindow will be output if the input PCollection is empty. To use this with inputs with other windowing, either Combine.Globally.withoutDefaults() or Combine.Globally.asSingletonView() must be called.

      See Combine.Globally for more information.
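
      To show how input and output types can differ, here is a minimal plain-Java sketch of an accumulator-style combine that turns integers into a mean of type Double. It is not Beam code: MeanCombineSketch and Accum are hypothetical names, and the four static methods only mirror the method names of Beam's Combine.CombineFn (createAccumulator, addInput, mergeAccumulators, extractOutput).

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model (no Beam dependency) of an accumulator-style combine.
class MeanCombineSketch {

  // Hypothetical accumulator type: a running sum and count.
  static final class Accum {
    long sum = 0;
    long count = 0;
  }

  static Accum createAccumulator() {
    return new Accum();
  }

  // Folds one input element into an accumulator.
  static Accum addInput(Accum acc, int input) {
    acc.sum += input;
    acc.count++;
    return acc;
  }

  // Merges accumulators that were built independently.
  static Accum mergeAccumulators(List<Accum> accs) {
    Accum merged = createAccumulator();
    for (Accum a : accs) {
      merged.sum += a.sum;
      merged.count += a.count;
    }
    return merged;
  }

  // The output type (Double) differs from the input type (Integer).
  static Double extractOutput(Accum acc) {
    return acc.count == 0 ? Double.NaN : (double) acc.sum / acc.count;
  }

  // Builds two partial accumulators, merges them, and extracts the mean.
  static Double demo() {
    Accum first = createAccumulator();
    for (int v : Arrays.asList(1, 2)) {
      addInput(first, v);
    }
    Accum second = createAccumulator();
    for (int v : Arrays.asList(3, 4)) {
      addInput(second, v);
    }
    return extractOutput(mergeAccumulators(Arrays.asList(first, second)));
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 2.5
  }
}
```

      Keeping the sum and count in the accumulator, rather than a running mean, is what makes the independently built partial results mergeable.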

    • perKey

      public static <K, V> Combine.PerKey<K,V,V> perKey(SerializableFunction<Iterable<V>,V> fn)
      Returns a Combine.PerKey PTransform that first groups its input PCollection of KVs by keys and windows, then invokes the given function on each group's list of values to produce a combined value, and then returns a PCollection of KVs mapping each distinct key to its combined value for each window.

      Each output element is in the window by which its corresponding input was grouped, and has the timestamp of the end of that window. The output PCollection has the same WindowFn as the input.

      See Combine.PerKey for more information.

    • perKey

      public static <K, V> Combine.PerKey<K,V,V> perKey(SerializableBiFunction<V,V,V> fn)
      Returns a Combine.PerKey PTransform that first groups its input PCollection of KVs by keys and windows, then uses the given SerializableBiFunction to combine each group's values into a single combined value, and then returns a PCollection of KVs mapping each distinct key to its combined value for each window.

      Each output element is in the window by which its corresponding input was grouped, and has the timestamp of the end of that window. The output PCollection has the same WindowFn as the input.

      See Combine.PerKey for more information.

    • perKey

      public static <K, InputT, OutputT> Combine.PerKey<K,InputT,OutputT> perKey(CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> fn)
      Returns a Combine.PerKey PTransform that first groups its input PCollection of KVs by keys and windows, then uses the given GlobalCombineFn to combine each group's values into a combined value, and then returns a PCollection of KVs mapping each distinct key to its combined value for each window.

      Each output element is in the window by which its corresponding input was grouped, and has the timestamp of the end of that window. The output PCollection has the same WindowFn as the input.

      See Combine.PerKey for more information.
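
      The group-by-key-and-window step described for perKey can be sketched in plain Java. This is a model, not Beam code: PerKeySketch and Element are hypothetical, a window is identified only by its end timestamp, and java.util.function.BinaryOperator stands in for SerializableBiFunction.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Plain-Java model (no Beam dependency) of per-key, per-window combining.
class PerKeySketch {

  // Hypothetical stand-in for a windowed KV<String, Integer>.
  static final class Element {
    final String key;
    final int value;
    final long windowEnd;

    Element(String key, int value, long windowEnd) {
      this.key = key;
      this.value = value;
      this.windowEnd = windowEnd;
    }
  }

  // Groups elements by (key, window) and folds each group's values with the combiner.
  // The map key "key@windowEnd" is an illustrative encoding of that pair.
  static Map<String, Integer> combinePerKey(List<Element> input, BinaryOperator<Integer> combiner) {
    Map<String, Integer> out = new TreeMap<>();
    for (Element e : input) {
      out.merge(e.key + "@" + e.windowEnd, e.value, combiner);
    }
    return out;
  }

  // Sums values for keys "a" and "b" across two hypothetical windows ending at 60 and 120.
  static Map<String, Integer> demo() {
    List<Element> input = Arrays.asList(
        new Element("a", 1, 60),
        new Element("b", 5, 60),
        new Element("a", 2, 60),
        new Element("a", 10, 120));
    return combinePerKey(input, Integer::sum);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints {a@120=10, a@60=3, b@60=5}
  }
}
```

      Key "a" produces two outputs, one per window, which matches the statement that each distinct key maps to its combined value for each window.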

    • fewKeys

      @Internal public static <K, InputT, OutputT> Combine.PerKey<K,InputT,OutputT> fewKeys(CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> fn)
      Returns a Combine.PerKey PTransform and sets fewKeys in GroupByKey.

    • groupedValues

      public static <K, V> Combine.GroupedValues<K,V,V> groupedValues(SerializableFunction<Iterable<V>,V> fn)
      Returns a Combine.GroupedValues PTransform that takes a PCollection of KVs where a key maps to an Iterable of values, e.g., the result of a GroupByKey, then uses the given SerializableFunction to combine all the values associated with a key, ignoring the key. The type of the input and output values must be the same.

      Each output element has the same timestamp and is in the same window as its corresponding input element, and the output PCollection has the same WindowFn associated with it as the input.

      See Combine.GroupedValues for more information.

      Note that perKey(SerializableFunction) is typically more convenient to use than GroupByKey followed by groupedValues(...).

    • groupedValues

      public static <K, V> Combine.GroupedValues<K,V,V> groupedValues(SerializableBiFunction<V,V,V> fn)
      Returns a Combine.GroupedValues PTransform that takes a PCollection of KVs where a key maps to an Iterable of values, e.g., the result of a GroupByKey, then uses the given SerializableBiFunction to combine all the values associated with a key, ignoring the key. The type of the input and output values must be the same.

      Each output element has the same timestamp and is in the same window as its corresponding input element, and the output PCollection has the same WindowFn associated with it as the input.

      See Combine.GroupedValues for more information.

      Note that perKey(SerializableBiFunction) is typically more convenient to use than GroupByKey followed by groupedValues(...).

    • groupedValues

      public static <K, InputT, OutputT> Combine.GroupedValues<K,InputT,OutputT> groupedValues(CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> fn)
      Returns a Combine.GroupedValues PTransform that takes a PCollection of KVs where a key maps to an Iterable of values, e.g., the result of a GroupByKey, then uses the given CombineFn to combine all the values associated with a key, ignoring the key. The types of the input and output values can differ.

      Each output element has the same timestamp and is in the same window as its corresponding input element, and the output PCollection has the same WindowFn associated with it as the input.

      See Combine.GroupedValues for more information.

      Note that perKey(CombineFnBase.GlobalCombineFn) is typically more convenient to use than GroupByKey followed by groupedValues(...).
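
      As a rough illustration of what groupedValues does to already-grouped input, the following plain-Java sketch combines each key's values while leaving the key untouched. It is a model, not Beam code: GroupedValuesSketch is a hypothetical class, a Map of lists stands in for the output of a GroupByKey, and java.util.function.Function stands in for SerializableFunction.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model (no Beam dependency) of combining already-grouped values.
class GroupedValuesSketch {

  // Stand-in for a SerializableFunction<Iterable<Integer>, Integer> that takes the maximum.
  static final Function<Iterable<Integer>, Integer> MAX = values -> {
    int best = Integer.MIN_VALUE;
    for (int v : values) {
      best = Math.max(best, v);
    }
    return best;
  };

  // Applies the combiner to each key's values; the key itself is never passed to the combiner.
  static <K, V> Map<K, V> combineGroupedValues(
      Map<K, List<V>> grouped, Function<Iterable<V>, V> combiner) {
    Map<K, V> out = new LinkedHashMap<>();
    for (Map.Entry<K, List<V>> entry : grouped.entrySet()) {
      out.put(entry.getKey(), combiner.apply(entry.getValue()));
    }
    return out;
  }

  // Hypothetical grouped input, as a GroupByKey might produce it.
  static Map<String, Integer> demo() {
    Map<String, List<Integer>> grouped = new LinkedHashMap<>();
    grouped.put("a", Arrays.asList(3, 1, 2));
    grouped.put("b", Arrays.asList(7));
    return combineGroupedValues(grouped, MAX);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints {a=3, b=7}
  }
}
```

      The sketch produces one output per input entry, in line with the statement that each output element corresponds to a single grouped input element.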