public class BatchStatefulParDoOverrides
extends java.lang.Object
PTransformOverrideFactories
that expands to correctly implement
stateful ParDo
using window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly
to linearize
processing per key.
For the Fn API, the PTransformOverrideFactory
is only required to perform
per key grouping and expansion.
This implementation relies on implementation details of the Dataflow runner, specifically
standard fusion behavior of ParDo
tranforms following a GroupByKey
.
Modifier and Type | Class and Description |
---|---|
static class |
BatchStatefulParDoOverrides.BatchStatefulDoFn<K,V,OutputT>
A key-preserving
DoFn that explodes an iterable that has been grouped by key and
window. |
Constructor and Description |
---|
BatchStatefulParDoOverrides() |
Modifier and Type | Method and Description |
---|---|
static <K,InputT,OutputT> |
multiOutputOverrideFactory(DataflowPipelineOptions options)
Returns a
PTransformOverrideFactory that replaces a multi-output
ParDo with a composite transform specialized for the DataflowRunner . |
static <K,InputT,OutputT> |
singleOutputOverrideFactory(DataflowPipelineOptions options)
Returns a
PTransformOverrideFactory that replaces a single-output ParDo with a
composite transform specialized for the DataflowRunner . |
public static <K,InputT,OutputT> org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,InputT>>,PCollection<OutputT>,ParDo.SingleOutput<KV<K,InputT>,OutputT>> singleOutputOverrideFactory(DataflowPipelineOptions options)
PTransformOverrideFactory
that replaces a single-output ParDo
with a
composite transform specialized for the DataflowRunner
.public static <K,InputT,OutputT> org.apache.beam.sdk.runners.PTransformOverrideFactory<PCollection<KV<K,InputT>>,PCollectionTuple,ParDo.MultiOutput<KV<K,InputT>,OutputT>> multiOutputOverrideFactory(DataflowPipelineOptions options)
PTransformOverrideFactory
that replaces a multi-output
ParDo
with a composite transform specialized for the DataflowRunner
.