Class DoFnOperator<PreInputT,InputT,OutputT>
java.lang.Object
org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator<PreInputT,InputT,OutputT>
- All Implemented Interfaces:
Serializable,org.apache.flink.api.common.state.CheckpointListener,org.apache.flink.streaming.api.operators.Input<WindowedValue<PreInputT>>,org.apache.flink.streaming.api.operators.KeyContext,org.apache.flink.streaming.api.operators.KeyContextHandler,org.apache.flink.streaming.api.operators.OneInputStreamOperator<WindowedValue<PreInputT>,,WindowedValue<OutputT>> org.apache.flink.streaming.api.operators.SetupableStreamOperator<WindowedValue<OutputT>>,org.apache.flink.streaming.api.operators.StreamOperator<WindowedValue<OutputT>>,org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator,org.apache.flink.streaming.api.operators.Triggerable<FlinkKey,,org.apache.beam.runners.core.TimerInternals.TimerData> org.apache.flink.streaming.api.operators.TwoInputStreamOperator<WindowedValue<PreInputT>,RawUnionValue, WindowedValue<OutputT>>
- Direct Known Subclasses:
ExecutableStageDoFnOperator,PartialReduceBundleOperator,SplittableDoFnOperator,WindowDoFnOperator
public class DoFnOperator<PreInputT,InputT,OutputT>
extends org.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
implements org.apache.flink.streaming.api.operators.OneInputStreamOperator<WindowedValue<PreInputT>,WindowedValue<OutputT>>, org.apache.flink.streaming.api.operators.TwoInputStreamOperator<WindowedValue<PreInputT>,RawUnionValue,WindowedValue<OutputT>>, org.apache.flink.streaming.api.operators.Triggerable<FlinkKey,org.apache.beam.runners.core.TimerInternals.TimerData>
Flink operator for executing
DoFns.- See Also:
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic classAWindowedValueReceiverthat can buffer its outputs.protected classStepContextfor runningDoFnson Flink.static classImplementation ofDoFnOperator.OutputManagerFactorythat creates anDoFnOperator.BufferedOutputManagerthat can write to multiple logical outputs by Flink side output. -
Field Summary
FieldsModifier and TypeFieldDescriptionprotected BufferingDoFnRunner<InputT, OutputT> protected FlinkStateInternals<?> protected DoFnOperator.BufferedOutputManager<OutputT> protected final org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> protected final org.apache.beam.runners.core.construction.SerializablePipelineOptionsprotected org.apache.beam.runners.core.SideInputHandlerprotected org.apache.beam.runners.core.SideInputReaderprotected final Collection<PCollectionView<?>> protected final Map<Integer, PCollectionView<?>> protected final Stringprotected DoFnOperator<PreInputT,InputT, OutputT>.org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.FlinkTimerInternals protected org.apache.flink.streaming.api.operators.InternalTimerService<org.apache.beam.runners.core.TimerInternals.TimerData> protected final WindowingStrategy<?, ?> Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
chainingStrategy, config, latencyStats, metrics, output, processingTimeService -
Constructor Summary
ConstructorsConstructorDescriptionDoFnOperator(@Nullable DoFn<InputT, OutputT> doFn, String stepName, Coder<WindowedValue<InputT>> inputWindowedCoder, Map<TupleTag<?>, Coder<?>> outputCoders, TupleTag<OutputT> mainOutputTag, List<TupleTag<?>> additionalOutputTags, org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory, WindowingStrategy<?, ?> windowingStrategy, Map<Integer, PCollectionView<?>> sideInputTagMapping, Collection<PCollectionView<?>> sideInputs, PipelineOptions options, @Nullable Coder<?> keyCoder, @Nullable org.apache.flink.api.java.functions.KeySelector<WindowedValue<InputT>, ?> keySelector, DoFnSchemaInformation doFnSchemaInformation, Map<String, PCollectionView<?>> sideInputMapping) Constructor for DoFnOperator. -
Method Summary
Modifier and TypeMethodDescriptionprotected voidaddSideInputValue(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord) Add the side input value.longapplyInputWatermarkHold(long inputWatermark) Allows to apply a hold to the input watermark.longapplyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) Allows to apply a hold to the output watermark before it is sent out.voidclose()createWrappingDoFnRunner(org.apache.beam.runners.core.DoFnRunner<InputT, OutputT> wrappedRunner, org.apache.beam.runners.core.StepContext stepContext) voidfinish()protected voidfireTimer(org.apache.beam.runners.core.TimerInternals.TimerData timerData) protected voidfireTimerInternal(FlinkKey key, org.apache.beam.runners.core.TimerInternals.TimerData timerData) longgetDoFn()longprotected LockSubclasses may provide a lock to ensure that the state backend is not accessed concurrently during bundle execution.voidinitializeState(org.apache.flink.runtime.state.StateInitializationContext context) protected final voidvoidnotifyCheckpointComplete(long checkpointId) protected intvoidonEventTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer) voidonProcessingTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer) voidopen()voidprepareSnapshotPreBarrier(long checkpointId) protected Iterable<WindowedValue<InputT>> preProcess(WindowedValue<PreInputT> input) final voidprocessElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord) final voidprocessElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord) final voidprocessElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord) final voidprocessWatermark(org.apache.flink.streaming.api.watermark.Watermark mark) final voidprocessWatermark1(org.apache.flink.streaming.api.watermark.Watermark mark) final voidprocessWatermark2(org.apache.flink.streaming.api.watermark.Watermark mark) protected voidscheduleForCurrentProcessingTime(org.apache.flink.api.common.operators.ProcessingTimeService.ProcessingTimeCallback callback) protected final voidsetBundleFinishedCallback(Runnable callback) protected final voidsetPreBundleCallback(Runnable callback) voidsetup(org.apache.flink.streaming.runtime.tasks.StreamTask<?, ?> containingTask, org.apache.flink.streaming.api.graph.StreamConfig config, org.apache.flink.streaming.api.operators.Output<org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<OutputT>>> output) protected booleanvoidsnapshotState(org.apache.flink.runtime.state.StateSnapshotContext context) Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, snapshotStateMethods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAbortedMethods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processWatermarkStatusMethods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKeyMethods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContextMethods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElementMethods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
getMetricGroup, getOperatorID, initializeState, setKeyContextElement1, setKeyContextElement2, snapshotStateMethods inherited from interface org.apache.flink.streaming.api.operators.TwoInputStreamOperator
processLatencyMarker1, processLatencyMarker2, processWatermarkStatus1, processWatermarkStatus2
-
Field Details
-
doFn
-
serializedOptions
protected final org.apache.beam.runners.core.construction.SerializablePipelineOptions serializedOptions -
mainOutputTag
-
additionalOutputTags
-
sideInputs
-
sideInputTagMapping
-
windowingStrategy
-
outputManagerFactory
protected final org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory -
doFnRunner
-
pushbackDoFnRunner
-
bufferingDoFnRunner
-
sideInputHandler
protected transient org.apache.beam.runners.core.SideInputHandler sideInputHandler -
sideInputReader
protected transient org.apache.beam.runners.core.SideInputReader sideInputReader -
outputManager
-
keyedStateInternals
-
timerInternals
protected transient DoFnOperator<PreInputT,InputT, timerInternalsOutputT>.org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.FlinkTimerInternals -
stepName
-
timerService
protected transient org.apache.flink.streaming.api.operators.InternalTimerService<org.apache.beam.runners.core.TimerInternals.TimerData> timerService
-
-
Constructor Details
-
DoFnOperator
public DoFnOperator(@Nullable DoFn<InputT, OutputT> doFn, String stepName, Coder<WindowedValue<InputT>> inputWindowedCoder, Map<TupleTag<?>, Coder<?>> outputCoders, TupleTag<OutputT> mainOutputTag, List<TupleTag<?>> additionalOutputTags, org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.OutputManagerFactory<OutputT> outputManagerFactory, WindowingStrategy<?, ?> windowingStrategy, Map<Integer, PCollectionView<?>> sideInputTagMapping, Collection<PCollectionView<?>> sideInputs, PipelineOptions options, @Nullable Coder<?> keyCoder, @Nullable org.apache.flink.api.java.functions.KeySelector<WindowedValue<InputT>, ?> keySelector, DoFnSchemaInformation doFnSchemaInformation, Map<String, PCollectionView<?>> sideInputMapping) Constructor for DoFnOperator.
-
-
Method Details
-
getDoFn
-
preProcess
-
createWrappingDoFnRunner
-
setup
public void setup(org.apache.flink.streaming.runtime.tasks.StreamTask<?, ?> containingTask, org.apache.flink.streaming.api.graph.StreamConfig config, org.apache.flink.streaming.api.operators.Output<org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<OutputT>>> output) - Specified by:
setupin interfaceorg.apache.flink.streaming.api.operators.SetupableStreamOperator<PreInputT>- Overrides:
setupin classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
-
shoudBundleElements
protected boolean shoudBundleElements() -
initializeState
public void initializeState(org.apache.flink.runtime.state.StateInitializationContext context) throws Exception - Specified by:
initializeStatein interfaceorg.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator- Overrides:
initializeStatein classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
getLockToAcquireForStateAccessDuringBundles
Subclasses may provide a lock to ensure that the state backend is not accessed concurrently during bundle execution. -
open
- Specified by:
openin interfaceorg.apache.flink.streaming.api.operators.StreamOperator<PreInputT>- Overrides:
openin classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
finish
- Specified by:
finishin interfaceorg.apache.flink.streaming.api.operators.StreamOperator<PreInputT>- Overrides:
finishin classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
close
- Specified by:
closein interfaceorg.apache.flink.streaming.api.operators.StreamOperator<PreInputT>- Overrides:
closein classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
numProcessingTimeTimers
protected int numProcessingTimeTimers() -
getEffectiveInputWatermark
public long getEffectiveInputWatermark() -
getCurrentOutputWatermark
public long getCurrentOutputWatermark() -
setPreBundleCallback
-
setBundleFinishedCallback
-
processElement
public final void processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord) - Specified by:
processElementin interfaceorg.apache.flink.streaming.api.operators.Input<PreInputT>
-
processElement1
public final void processElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<WindowedValue<PreInputT>> streamRecord) throws Exception -
addSideInputValue
protected void addSideInputValue(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord) Add the side input value. Here we are assuming that views have already been materialized and are sent over the wire asIterable. Subclasses may elect to perform materialization in state and receive side input incrementally instead.- Parameters:
streamRecord-
-
processElement2
public final void processElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<RawUnionValue> streamRecord) throws Exception -
processWatermark
public final void processWatermark(org.apache.flink.streaming.api.watermark.Watermark mark) throws Exception - Specified by:
processWatermarkin interfaceorg.apache.flink.streaming.api.operators.Input<PreInputT>- Overrides:
processWatermarkin classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
processWatermark1
public final void processWatermark1(org.apache.flink.streaming.api.watermark.Watermark mark) throws Exception -
applyInputWatermarkHold
public long applyInputWatermarkHold(long inputWatermark) Allows to apply a hold to the input watermark. By default, just passes the input watermark through. -
applyOutputWatermarkHold
public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) Allows to apply a hold to the output watermark before it is sent out. Used to apply hold on output watermark for delayed (asynchronous or buffered) processing.- Parameters:
currentOutputWatermark- the current output watermarkpotentialOutputWatermark- The potential new output watermark which can be adjusted, if needed. The input watermark hold has already been applied.- Returns:
- The new output watermark which will be emitted.
-
processWatermark2
public final void processWatermark2(org.apache.flink.streaming.api.watermark.Watermark mark) throws Exception -
scheduleForCurrentProcessingTime
protected void scheduleForCurrentProcessingTime(org.apache.flink.api.common.operators.ProcessingTimeService.ProcessingTimeCallback callback) -
invokeFinishBundle
protected final void invokeFinishBundle() -
prepareSnapshotPreBarrier
public void prepareSnapshotPreBarrier(long checkpointId) - Specified by:
prepareSnapshotPreBarrierin interfaceorg.apache.flink.streaming.api.operators.StreamOperator<PreInputT>- Overrides:
prepareSnapshotPreBarrierin classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>
-
snapshotState
public void snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context) throws Exception - Specified by:
snapshotStatein interfaceorg.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator- Overrides:
snapshotStatein classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
getBundleFinalizer
-
notifyCheckpointComplete
- Specified by:
notifyCheckpointCompletein interfaceorg.apache.flink.api.common.state.CheckpointListener- Overrides:
notifyCheckpointCompletein classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<WindowedValue<OutputT>>- Throws:
Exception
-
onEventTime
public void onEventTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer) -
onProcessingTime
public void onProcessingTime(org.apache.flink.streaming.api.operators.InternalTimer<FlinkKey, org.apache.beam.runners.core.TimerInternals.TimerData> timer) -
fireTimerInternal
protected void fireTimerInternal(FlinkKey key, org.apache.beam.runners.core.TimerInternals.TimerData timerData) -
fireTimer
protected void fireTimer(org.apache.beam.runners.core.TimerInternals.TimerData timerData)
-