Package org.apache.beam.runners.dataflow
Class BatchStatefulParDoOverrides
java.lang.Object
org.apache.beam.runners.dataflow.BatchStatefulParDoOverrides
PTransformOverrideFactories
that expands to correctly implement
stateful ParDo
using window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly
to linearize
processing per key.
For the Fn API, the PTransformOverrideFactory
is only required to perform per key
grouping and expansion.
This implementation relies on implementation details of the Dataflow runner, specifically
standard fusion behavior of ParDo
transforms following a GroupByKey
.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic class
A key-preservingDoFn
that explodes an iterable that has been grouped by key and window. -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic <K,
InputT, OutputT>
org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K, InputT>>, PCollectionTuple, ParDo.MultiOutput<KV<K, InputT>, OutputT>> Returns aPTransformOverrideFactory
that replaces a multi-outputParDo
with a composite transform specialized for theDataflowRunner
.static <K,
InputT, OutputT>
org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K, InputT>>, PCollection<OutputT>, ParDo.SingleOutput<KV<K, InputT>, OutputT>> Returns aPTransformOverrideFactory
that replaces a single-outputParDo
with a composite transform specialized for theDataflowRunner
.
-
Constructor Details
-
BatchStatefulParDoOverrides
public BatchStatefulParDoOverrides()
-
-
Method Details
-
singleOutputOverrideFactory
public static <K,InputT, org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,OutputT> InputT>>, singleOutputOverrideFactory()PCollection<OutputT>, ParDo.SingleOutput<KV<K, InputT>, OutputT>> Returns aPTransformOverrideFactory
that replaces a single-outputParDo
with a composite transform specialized for theDataflowRunner
. -
multiOutputOverrideFactory
public static <K,InputT, org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,OutputT> InputT>>, multiOutputOverrideFactoryPCollectionTuple, ParDo.MultiOutput<KV<K, InputT>, OutputT>> (DataflowPipelineOptions options) Returns aPTransformOverrideFactory
that replaces a multi-outputParDo
with a composite transform specialized for theDataflowRunner
.
-