public class BatchStatefulParDoOverrides
extends java.lang.Object
PTransformOverrideFactories
that expands to correctly implement
stateful ParDo
using window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly
to linearize
processing per key.
For the Fn API, the PTransformOverrideFactory
is only required to perform per key
grouping and expansion.
This implementation relies on implementation details of the Dataflow runner, specifically
standard fusion behavior of ParDo
tranforms following a GroupByKey
.
Modifier and Type | Class and Description |
---|---|
static class |
BatchStatefulParDoOverrides.BatchStatefulDoFn<K,V,OutputT>
A key-preserving
DoFn that explodes an iterable that has been grouped by key and
window. |
Constructor and Description |
---|
BatchStatefulParDoOverrides() |
Modifier and Type | Method and Description |
---|---|
static <K,InputT,OutputT> |
multiOutputOverrideFactory(DataflowPipelineOptions options)
Returns a
PTransformOverrideFactory that replaces a multi-output ParDo with a
composite transform specialized for the DataflowRunner . |
static <K,InputT,OutputT> |
singleOutputOverrideFactory()
Returns a
PTransformOverrideFactory that replaces a single-output ParDo with a
composite transform specialized for the DataflowRunner . |
public static <K,InputT,OutputT> org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,InputT>>,PCollection<OutputT>,ParDo.SingleOutput<KV<K,InputT>,OutputT>> singleOutputOverrideFactory()
PTransformOverrideFactory
that replaces a single-output ParDo
with a
composite transform specialized for the DataflowRunner
.public static <K,InputT,OutputT> org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,InputT>>,PCollectionTuple,ParDo.MultiOutput<KV<K,InputT>,OutputT>> multiOutputOverrideFactory(DataflowPipelineOptions options)
PTransformOverrideFactory
that replaces a multi-output ParDo
with a
composite transform specialized for the DataflowRunner
.