Package org.apache.beam.runners.dataflow
Class BatchStatefulParDoOverrides
java.lang.Object
org.apache.beam.runners.dataflow.BatchStatefulParDoOverrides
PTransformOverrideFactories that expands to correctly implement
stateful ParDo using window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly to linearize
processing per key.
For the Fn API, the PTransformOverrideFactory is only required to perform per key
grouping and expansion.
This implementation relies on implementation details of the Dataflow runner, specifically
standard fusion behavior of ParDo transforms following a GroupByKey.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic classA key-preservingDoFnthat explodes an iterable that has been grouped by key and window. -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic <K,InputT, OutputT>
org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K, InputT>>, PCollectionTuple, ParDo.MultiOutput<KV<K, InputT>, OutputT>> Returns aPTransformOverrideFactorythat replaces a multi-outputParDowith a composite transform specialized for theDataflowRunner.static <K,InputT, OutputT>
org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K, InputT>>, PCollection<OutputT>, ParDo.SingleOutput<KV<K, InputT>, OutputT>> Returns aPTransformOverrideFactorythat replaces a single-outputParDowith a composite transform specialized for theDataflowRunner.
-
Constructor Details
-
BatchStatefulParDoOverrides
public BatchStatefulParDoOverrides()
-
-
Method Details
-
singleOutputOverrideFactory
public static <K,InputT, org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,OutputT> InputT>>, singleOutputOverrideFactory()PCollection<OutputT>, ParDo.SingleOutput<KV<K, InputT>, OutputT>> Returns aPTransformOverrideFactorythat replaces a single-outputParDowith a composite transform specialized for theDataflowRunner. -
multiOutputOverrideFactory
public static <K,InputT, org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,OutputT> InputT>>, multiOutputOverrideFactoryPCollectionTuple, ParDo.MultiOutput<KV<K, InputT>, OutputT>> (DataflowPipelineOptions options) Returns aPTransformOverrideFactorythat replaces a multi-outputParDowith a composite transform specialized for theDataflowRunner.
-