Class BatchStatefulParDoOverrides

java.lang.Object
org.apache.beam.runners.dataflow.BatchStatefulParDoOverrides

public class BatchStatefulParDoOverrides extends Object
PTransformOverrideFactories that expands to correctly implement stateful ParDo using window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly to linearize processing per key.

For the Fn API, the PTransformOverrideFactory is only required to perform per key grouping and expansion.

This implementation relies on implementation details of the Dataflow runner, specifically standard fusion behavior of ParDo transforms following a GroupByKey.

  • Constructor Details

    • BatchStatefulParDoOverrides

      public BatchStatefulParDoOverrides()
  • Method Details

    • singleOutputOverrideFactory

      public static <K, InputT, OutputT> org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,InputT>>,PCollection<OutputT>,ParDo.SingleOutput<KV<K,InputT>,OutputT>> singleOutputOverrideFactory()
      Returns a PTransformOverrideFactory that replaces a single-output ParDo with a composite transform specialized for the DataflowRunner.
    • multiOutputOverrideFactory

      public static <K, InputT, OutputT> org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,InputT>>,PCollectionTuple,ParDo.MultiOutput<KV<K,InputT>,OutputT>> multiOutputOverrideFactory(DataflowPipelineOptions options)
      Returns a PTransformOverrideFactory that replaces a multi-output ParDo with a composite transform specialized for the DataflowRunner.