Class DoFnOperator<PreInputT,InputT,OutputT>
java.lang.Object
org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator<PreInputT,InputT,OutputT>
- All Implemented Interfaces:
Serializable
org.apache.flink.api.common.state.CheckpointListener
org.apache.flink.streaming.api.operators.Input<WindowedValue<PreInputT>>
org.apache.flink.streaming.api.operators.KeyContext
org.apache.flink.streaming.api.operators.KeyContextHandler
org.apache.flink.streaming.api.operators.OneInputStreamOperator<WindowedValue<PreInputT>,WindowedValue<OutputT>>
org.apache.flink.streaming.api.operators.SetupableStreamOperator<WindowedValue<OutputT>>
org.apache.flink.streaming.api.operators.StreamOperator<WindowedValue<OutputT>>
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator
org.apache.flink.streaming.api.operators.Triggerable<FlinkKey,org.apache.beam.runners.core.TimerInternals.TimerData>
org.apache.flink.streaming.api.operators.TwoInputStreamOperator<WindowedValue<PreInputT>,RawUnionValue,WindowedValue<OutputT>>
- Direct Known Subclasses:
ExecutableStageDoFnOperator
PartialReduceBundleOperator
SplittableDoFnOperator
WindowDoFnOperator
public class DoFnOperator<PreInputT,InputT,OutputT>
extends org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
implements org.apache.flink.streaming.api.operators.OneInputStreamOperator<WindowedValue<PreInputT>,WindowedValue<OutputT>>, org.apache.flink.streaming.api.operators.TwoInputStreamOperator<WindowedValue<PreInputT>,RawUnionValue,WindowedValue<OutputT>>, org.apache.flink.streaming.api.operators.Triggerable<FlinkKey,org.apache.beam.runners.core.TimerInternals.TimerData>
Flink operator for executing DoFns.
-
Nested Class Summary
Nested Classes
static class
A WindowedValueReceiver that can buffer its outputs.
protected class
StepContext for running DoFns on Flink.
static class
Implementation of DoFnOperator.OutputManagerFactory that creates a DoFnOperator.BufferedOutputManager that can write to multiple logical outputs by Flink side output.
-
Field Summary
Fields
protected BufferingDoFnRunner<InputT, OutputT> bufferingDoFnRunner
protected FlinkStateInternals<?> keyedStateInternals
protected DoFnOperator.BufferedOutputManager<OutputT> outputManager
protected final org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory
protected final org.apache.beam.runners.core.construction.SerializablePipelineOptions serializedOptions
protected org.apache.beam.runners.core.SideInputHandler sideInputHandler
protected org.apache.beam.runners.core.SideInputReader sideInputReader
protected final Collection<PCollectionView<?>> sideInputs
protected final Map<Integer, PCollectionView<?>> sideInputTagMapping
protected final String stepName
protected DoFnOperator<PreInputT,InputT,OutputT>.org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.FlinkTimerInternals timerInternals
protected org.apache.flink.streaming.api.operators.InternalTimerService<org.apache.beam.runners.core.TimerInternals.TimerData> timerService
protected final WindowingStrategy<?, ?> windowingStrategy
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
chainingStrategy, config, latencyStats, metrics, output, processingTimeService
-
Constructor Summary
Constructors
DoFnOperator(@Nullable DoFn<InputT, OutputT> doFn, String stepName, Coder<WindowedValue<InputT>> inputWindowedCoder, Map<TupleTag<?>, Coder<?>> outputCoders, TupleTag<OutputT> mainOutputTag, List<TupleTag<?>> additionalOutputTags, org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory, WindowingStrategy<?, ?> windowingStrategy, Map<Integer, PCollectionView<?>> sideInputTagMapping, Collection<PCollectionView<?>> sideInputs, PipelineOptions options, @Nullable Coder<?> keyCoder, @Nullable org.apache.flink.api.java.functions.KeySelector<WindowedValue<InputT>, ?> keySelector, DoFnSchemaInformation doFnSchemaInformation, Map<String, PCollectionView<?>> sideInputMapping)
Constructor for DoFnOperator.
-
Method Summary
protected void addSideInputValue(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord)
Add the side input value.
long applyInputWatermarkHold(long inputWatermark)
Allows a hold to be applied to the input watermark.
long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark)
Allows a hold to be applied to the output watermark before it is sent out.
void close()
createWrappingDoFnRunner(org.apache.beam.runners.core.DoFnRunner<InputT, OutputT> wrappedRunner, org.apache.beam.runners.core.StepContext stepContext)
void finish()
protected void fireTimer(org.apache.beam.runners.core.TimerInternals.TimerData timerData)
protected void fireTimerInternal(FlinkKey key, org.apache.beam.runners.core.TimerInternals.TimerData timerData)
getBundleFinalizer()
long getCurrentOutputWatermark()
getDoFn()
long getEffectiveInputWatermark()
protected Lock getLockToAcquireForStateAccessDuringBundles()
Subclasses may provide a lock to ensure that the state backend is not accessed concurrently during bundle execution.
void initializeState(org.apache.flink.runtime.state.StateInitializationContext context)
protected final void invokeFinishBundle()
void notifyCheckpointComplete(long checkpointId)
protected int numProcessingTimeTimers()
void onEventTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer)
void onProcessingTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer)
void open()
void prepareSnapshotPreBarrier(long checkpointId)
protected Iterable<WindowedValue<InputT>> preProcess(WindowedValue<PreInputT> input)
final void processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord)
final void processElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord)
final void processElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord)
final void processWatermark(org.apache.flink.streaming.api.watermark.Watermark mark)
final void processWatermark1(org.apache.flink.streaming.api.watermark.Watermark mark)
final void processWatermark2(org.apache.flink.streaming.api.watermark.Watermark mark)
protected void scheduleForCurrentProcessingTime(org.apache.flink.api.common.operators.ProcessingTimeService.ProcessingTimeCallback callback)
protected final void setBundleFinishedCallback(Runnable callback)
protected final void setPreBundleCallback(Runnable callback)
void setup(org.apache.flink.streaming.runtime.tasks.StreamTask<?, ?> containingTask, org.apache.flink.streaming.api.graph.StreamConfig config, org.apache.flink.streaming.api.operators.Output<org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<OutputT>>> output)
protected boolean shoudBundleElements()
void snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context)
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, snapshotState
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processWatermarkStatus
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
getMetricGroup, getOperatorID, initializeState, setKeyContextElement1, setKeyContextElement2, snapshotState
Methods inherited from interface org.apache.flink.streaming.api.operators.TwoInputStreamOperator
processLatencyMarker1, processLatencyMarker2, processWatermarkStatus1, processWatermarkStatus2
-
Field Details
-
doFn
-
serializedOptions
protected final org.apache.beam.runners.core.construction.SerializablePipelineOptions serializedOptions -
mainOutputTag
-
additionalOutputTags
-
sideInputs
-
sideInputTagMapping
-
windowingStrategy
-
outputManagerFactory
protected final org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory -
doFnRunner
-
pushbackDoFnRunner
-
bufferingDoFnRunner
-
sideInputHandler
protected transient org.apache.beam.runners.core.SideInputHandler sideInputHandler -
sideInputReader
protected transient org.apache.beam.runners.core.SideInputReader sideInputReader -
outputManager
-
keyedStateInternals
-
timerInternals
protected transient DoFnOperator<PreInputT,InputT,OutputT>.org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.FlinkTimerInternals timerInternals
-
stepName
-
timerService
protected transient org.apache.flink.streaming.api.operators.InternalTimerService<org.apache.beam.runners.core.TimerInternals.TimerData> timerService
-
-
Constructor Details
-
DoFnOperator
public DoFnOperator(
    @Nullable DoFn<InputT, OutputT> doFn,
    String stepName,
    Coder<WindowedValue<InputT>> inputWindowedCoder,
    Map<TupleTag<?>, Coder<?>> outputCoders,
    TupleTag<OutputT> mainOutputTag,
    List<TupleTag<?>> additionalOutputTags,
    org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory,
    WindowingStrategy<?, ?> windowingStrategy,
    Map<Integer, PCollectionView<?>> sideInputTagMapping,
    Collection<PCollectionView<?>> sideInputs,
    PipelineOptions options,
    @Nullable Coder<?> keyCoder,
    @Nullable org.apache.flink.api.java.functions.KeySelector<WindowedValue<InputT>, ?> keySelector,
    DoFnSchemaInformation doFnSchemaInformation,
    Map<String, PCollectionView<?>> sideInputMapping)
Constructor for DoFnOperator.
-
-
Method Details
-
getDoFn
-
preProcess
-
createWrappingDoFnRunner
-
setup
public void setup(org.apache.flink.streaming.runtime.tasks.StreamTask<?, ?> containingTask, org.apache.flink.streaming.api.graph.StreamConfig config, org.apache.flink.streaming.api.operators.Output<org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<OutputT>>> output)
- Specified by:
setup in interface org.apache.flink.streaming.api.operators.SetupableStreamOperator<PreInputT>
- Overrides:
setup in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
-
shoudBundleElements
protected boolean shoudBundleElements() -
initializeState
public void initializeState(org.apache.flink.runtime.state.StateInitializationContext context) throws Exception
- Specified by:
initializeState in interface org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
initializeState in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
getLockToAcquireForStateAccessDuringBundles
Subclasses may provide a lock to ensure that the state backend is not accessed concurrently during bundle execution. -
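To illustrate what getLockToAcquireForStateAccessDuringBundles is for, here is a minimal, self-contained sketch. It does not use Beam or Flink classes: BundleStateLockSketch, processBundle, read, and the in-memory map standing in for a state backend are all hypothetical names chosen for the illustration. It only shows the pattern of taking the provided lock around every state access made while a bundle is being processed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for an operator subclass; not the Beam class. */
final class BundleStateLockSketch {
    // Assumption: a plain map plays the role of the state backend.
    private final Map<String, Long> state = new HashMap<>();
    private final Lock stateLock = new ReentrantLock();

    /** Mirrors the hook's name: returns the lock guarding state access. */
    Lock getLockToAcquireForStateAccessDuringBundles() {
        return stateLock;
    }

    /** Processes one bundle while holding the lock for its whole duration. */
    void processBundle(Iterable<String> keys) {
        Lock lock = getLockToAcquireForStateAccessDuringBundles();
        lock.lock();
        try {
            for (String key : keys) {
                state.merge(key, 1L, Long::sum); // count occurrences per key
            }
        } finally {
            lock.unlock();
        }
    }

    /** Any other state access (for example from a timer) takes the same lock. */
    long read(String key) {
        stateLock.lock();
        try {
            return state.getOrDefault(key, 0L);
        } finally {
            stateLock.unlock();
        }
    }

    public static void main(String[] args) {
        BundleStateLockSketch op = new BundleStateLockSketch();
        op.processBundle(Arrays.asList("a", "b", "a"));
        System.out.println(op.read("a")); // prints 2
    }
}
```

Because both the bundle and the reader go through the same lock, a concurrent reader never observes a half-applied bundle in this sketch.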
open
- Specified by:
open in interface org.apache.flink.streaming.api.operators.StreamOperator<PreInputT>
- Overrides:
open in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
finish
- Specified by:
finish in interface org.apache.flink.streaming.api.operators.StreamOperator<PreInputT>
- Overrides:
finish in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
close
- Specified by:
close in interface org.apache.flink.streaming.api.operators.StreamOperator<PreInputT>
- Overrides:
close in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
numProcessingTimeTimers
protected int numProcessingTimeTimers() -
getEffectiveInputWatermark
public long getEffectiveInputWatermark() -
getCurrentOutputWatermark
public long getCurrentOutputWatermark() -
setPreBundleCallback
-
setBundleFinishedCallback
-
processElement
public final void processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord)
- Specified by:
processElement in interface org.apache.flink.streaming.api.operators.Input<PreInputT>
-
processElement1
public final void processElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord) throws Exception -
addSideInputValue
protected void addSideInputValue(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord)
Add the side input value. Here we are assuming that views have already been materialized and are sent over the wire as Iterable. Subclasses may elect to perform materialization in state and receive side input incrementally instead.
- Parameters:
streamRecord -
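The idea behind addSideInputValue, that a side input arrives already materialized as an Iterable and is stored under the view its tag maps to, can be sketched without Beam or Flink types. Everything below is an assumption made for illustration: SideInputSketch, isReady, and get are hypothetical, view names are plain strings, the integer tag stands in for the union tag carried by a RawUnionValue, and windowing is ignored entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for side input handling; not the Beam class. */
final class SideInputSketch {
    // Assumption: maps a union tag to a view name, similar in spirit to sideInputTagMapping.
    private final Map<Integer, String> tagToView;
    // Materialized contents per view name.
    private final Map<String, List<Object>> materialized = new HashMap<>();

    SideInputSketch(Map<Integer, String> tagToView) {
        this.tagToView = tagToView;
    }

    /** Stores an already materialized side input under the view its tag maps to. */
    void addSideInputValue(int unionTag, Iterable<?> values) {
        String view = tagToView.get(unionTag);
        if (view == null) {
            throw new IllegalArgumentException("Unknown side input tag: " + unionTag);
        }
        List<Object> copy = new ArrayList<>();
        for (Object value : values) {
            copy.add(value);
        }
        materialized.put(view, copy);
    }

    /** True once a value for the view has been received. */
    boolean isReady(String view) {
        return materialized.containsKey(view);
    }

    /** Returns the stored contents, or null if nothing has arrived yet. */
    List<Object> get(String view) {
        return materialized.get(view);
    }

    public static void main(String[] args) {
        Map<Integer, String> mapping = new HashMap<>();
        mapping.put(1, "rates");
        SideInputSketch op = new SideInputSketch(mapping);
        op.addSideInputValue(1, Arrays.asList(1, 2, 3));
        System.out.println(op.get("rates")); // prints [1, 2, 3]
    }
}
```

A subclass that materializes in state instead would append values incrementally rather than replacing the whole list in one call.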
-
processElement2
public final void processElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord) throws Exception -
processWatermark
public final void processWatermark(org.apache.flink.streaming.api.watermark.Watermark mark) throws Exception
- Specified by:
processWatermark in interface org.apache.flink.streaming.api.operators.Input<PreInputT>
- Overrides:
processWatermark in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
processWatermark1
public final void processWatermark1(org.apache.flink.streaming.api.watermark.Watermark mark) throws Exception -
applyInputWatermarkHold
public long applyInputWatermarkHold(long inputWatermark)
Allows a hold to be applied to the input watermark. By default, the input watermark is passed through unchanged.
-
applyOutputWatermarkHold
public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark)
Allows a hold to be applied to the output watermark before it is sent out. Used to hold back the output watermark for delayed (asynchronous or buffered) processing.
- Parameters:
currentOutputWatermark - The current output watermark.
potentialOutputWatermark - The potential new output watermark, which can be adjusted if needed. The input watermark hold has already been applied.
- Returns:
The new output watermark which will be emitted.
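As an illustration of how a subclass might use the two hold hooks, the following self-contained sketch models an operator that buffers elements and must not let the output watermark pass them. It is not Beam code: WatermarkHoldSketch, buffer, emitted, and the policy of holding at one less than the earliest pending timestamp are assumptions made for the example. Only the two method names and their parameters come from the documentation above.

```java
import java.util.PriorityQueue;

/** Hypothetical stand-in for an operator that buffers elements; not the Beam class. */
final class WatermarkHoldSketch {
    // Timestamps of buffered elements that have not been emitted yet (min-heap).
    private final PriorityQueue<Long> pendingTimestamps = new PriorityQueue<>();

    void buffer(long timestamp) {
        pendingTimestamps.add(timestamp);
    }

    void emitted(long timestamp) {
        pendingTimestamps.remove(timestamp);
    }

    /** Default behavior described in the docs: pass the input watermark through. */
    long applyInputWatermarkHold(long inputWatermark) {
        return inputWatermark;
    }

    /**
     * Assumed policy: keep the output watermark strictly below the earliest
     * pending element, and never move it backwards.
     */
    long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
        Long earliestPending = pendingTimestamps.peek();
        if (earliestPending == null) {
            return potentialOutputWatermark; // nothing buffered, no hold needed
        }
        long held = Math.min(potentialOutputWatermark, earliestPending - 1);
        return Math.max(currentOutputWatermark, held);
    }

    public static void main(String[] args) {
        WatermarkHoldSketch op = new WatermarkHoldSketch();
        System.out.println(op.applyOutputWatermarkHold(0L, 100L));  // prints 100
        op.buffer(50L);
        System.out.println(op.applyOutputWatermarkHold(0L, 100L));  // prints 49
        op.emitted(50L);
        System.out.println(op.applyOutputWatermarkHold(49L, 100L)); // prints 100
    }
}
```

Once the buffered element has been emitted, the hold is released and the potential watermark is emitted unchanged.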
-
processWatermark2
public final void processWatermark2(org.apache.flink.streaming.api.watermark.Watermark mark) throws Exception -
scheduleForCurrentProcessingTime
protected void scheduleForCurrentProcessingTime(org.apache.flink.api.common.operators.ProcessingTimeService.ProcessingTimeCallback callback) -
invokeFinishBundle
protected final void invokeFinishBundle() -
prepareSnapshotPreBarrier
public void prepareSnapshotPreBarrier(long checkpointId)
- Specified by:
prepareSnapshotPreBarrier in interface org.apache.flink.streaming.api.operators.StreamOperator<PreInputT>
- Overrides:
prepareSnapshotPreBarrier in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
-
snapshotState
public void snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context) throws Exception
- Specified by:
snapshotState in interface org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
snapshotState in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
getBundleFinalizer
-
notifyCheckpointComplete
- Specified by:
notifyCheckpointComplete in interface org.apache.flink.api.common.state.CheckpointListener
- Overrides:
notifyCheckpointComplete in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
- Throws:
Exception
-
onEventTime
public void onEventTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer) -
onProcessingTime
public void onProcessingTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer) -
fireTimerInternal
protected void fireTimerInternal(FlinkKey key, org.apache.beam.runners.core.TimerInternals.TimerData timerData) -
fireTimer
protected void fireTimer(org.apache.beam.runners.core.TimerInternals.TimerData timerData)
-