@Internal public class WatermarkManager<ExecutableT,CollectionT> extends java.lang.Object
Manages watermarks of PCollections and input and output watermarks of AppliedPTransforms to provide event-time and completion tracking for in-memory
execution. WatermarkManager is designed to update and return a consistent view of
watermarks in the presence of concurrent updates.
A WatermarkManager is provided, at construction time, with the collection of root AppliedPTransforms and a map of PCollections to all the AppliedPTransforms that consume them.
Whenever a root executable produces elements, the WatermarkManager is provided with the produced elements and the output watermark of the
producing executable. The watermark manager is
responsible for computing the watermarks of all transforms that consume
one or more PCollections.
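Whenever a non-root AppliedPTransform finishes processing one or more in-flight
elements (referred to as the input bundle), the following occurs
atomically:
- All of the in-flight elements are removed from the collection of pending elements for the AppliedPTransform.
- All of the elements produced by the AppliedPTransform are added to the collection of pending elements for each AppliedPTransform that consumes them.
- The input watermark for the AppliedPTransform becomes the maximum value of
  - the previous input watermark
  - the minimum of
    - the timestamps of all currently pending elements
    - all input PCollection watermarks
- The output watermark for the AppliedPTransform becomes the maximum of
  - the previous output watermark
  - the minimum of
    - the current input watermark
    - the current watermark holds
- The watermark of the output PCollection can be advanced to the output watermark of the AppliedPTransform.
- The watermark of all downstream AppliedPTransforms can be advanced.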
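The pending-element bookkeeping that happens when a bundle completes can be illustrated with a toy model. This is a minimal sketch, not the WatermarkManager implementation: the class `PendingElementsSketch`, its methods, the use of transform names as `String` keys, and the use of `long` timestamps are all assumptions made for illustration, and the PCollection layer between producer and consumer is skipped. A single `synchronized` method stands in for the atomic update. Watermark recomputation is left out of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of pending-element tracking; not part of Beam. */
final class PendingElementsSketch {
  // Producer transform name -> names of the transforms that consume its output.
  private final Map<String, List<String>> consumersOf = new HashMap<>();
  // Transform name -> timestamps of elements still pending for that transform.
  private final Map<String, List<Long>> pending = new HashMap<>();

  synchronized void addConsumer(String producer, String consumer) {
    consumersOf.computeIfAbsent(producer, k -> new ArrayList<>()).add(consumer);
  }

  synchronized void addPending(String transform, List<Long> timestamps) {
    pending.computeIfAbsent(transform, k -> new ArrayList<>()).addAll(timestamps);
  }

  /** Applies both bookkeeping steps under one lock so readers never see a partial update. */
  synchronized void completeBundle(String transform, List<Long> processed, List<Long> produced) {
    // Step 1: the processed in-flight elements are no longer pending for this transform.
    List<Long> own = pending.computeIfAbsent(transform, k -> new ArrayList<>());
    for (Long ts : processed) {
      own.remove(ts); // remove(Object): drops the first matching timestamp
    }
    // Step 2: the produced elements become pending for every consumer.
    for (String consumer : consumersOf.getOrDefault(transform, Collections.emptyList())) {
      pending.computeIfAbsent(consumer, k -> new ArrayList<>()).addAll(produced);
    }
  }

  /** Returns a copy of the pending timestamps for a transform. */
  synchronized List<Long> pendingFor(String transform) {
    return new ArrayList<>(pending.getOrDefault(transform, Collections.emptyList()));
  }
}
```

For example, if a hypothetical transform `"read"` feeds `"parse"` and completes a bundle with timestamps 1 and 2 that produced elements at 5 and 7, then afterwards nothing is pending for `"read"` and 5 and 7 are pending for `"parse"`.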
The watermark of a PCollection is equal to the output watermark of the AppliedPTransform that produces it.
The watermarks for a PTransform are updated as follows when output is committed:
    Watermark_In' = MAX(Watermark_In, MIN(U(TS_Pending), U(Watermark_InputPCollection)))
    Watermark_Out' = MAX(Watermark_Out, MIN(Watermark_In', U(StateHold)))
    Watermark_PCollection = Watermark_Out_ProducingPTransform
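The update rules can be worked through with plain numbers. The following is a minimal sketch under stated assumptions: `WatermarkFormulaSketch` and its method names are hypothetical, timestamps are modelled as `long` values rather than Beam's `Instant`, `U(...)` is read as "all of the", and the minimum over an empty collection is taken to be `Long.MAX_VALUE` (no constraint).

```java
import java.util.Collection;

/** Hypothetical helper that evaluates the update rules on long timestamps; not part of Beam. */
final class WatermarkFormulaSketch {
  /** Minimum of the given timestamps, or Long.MAX_VALUE when there are none. */
  static long min(Collection<Long> timestamps) {
    long result = Long.MAX_VALUE;
    for (long ts : timestamps) {
      result = Math.min(result, ts);
    }
    return result;
  }

  /** Watermark_In' = MAX(Watermark_In, MIN(U(TS_Pending), U(Watermark_InputPCollection))). */
  static long nextInputWatermark(
      long currentIn, Collection<Long> pendingTimestamps, Collection<Long> inputWatermarks) {
    return Math.max(currentIn, Math.min(min(pendingTimestamps), min(inputWatermarks)));
  }

  /** Watermark_Out' = MAX(Watermark_Out, MIN(Watermark_In', U(StateHold))). */
  static long nextOutputWatermark(long currentOut, long newIn, Collection<Long> stateHolds) {
    return Math.max(currentOut, Math.min(newIn, min(stateHolds)));
  }
}
```

With a current input watermark of 10, pending timestamps 25 and 40, and a single input PCollection watermark of 30, the new input watermark is MAX(10, MIN(25, 30)) = 25. With a current output watermark of 5 and a state hold at 20, the new output watermark is MAX(5, MIN(25, 20)) = 20. The outer MAX keeps both watermarks from moving backwards.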
| Modifier and Type | Class and Description |
|---|---|
| static class | WatermarkManager.FiredTimers<ExecutableT>: A pair of TimerInternals.TimerData and key which can be delivered to the appropriate AppliedPTransform. |
| static class | WatermarkManager.TimerUpdate: A collection of newly set, deleted, and completed timers. |
| class | WatermarkManager.TransformWatermarks: A reference to the input and output watermarks of an AppliedPTransform. |
| Modifier and Type | Method and Description |
|---|---|
| static <ExecutableT,CollectionT> WatermarkManager<ExecutableT,? super CollectionT> | create(Clock clock, ExecutableGraph<ExecutableT,? super CollectionT> graph, java.util.function.Function<ExecutableT,java.lang.String> getName): Creates a new WatermarkManager. |
| java.util.Collection<WatermarkManager.FiredTimers<ExecutableT>> | extractFiredTimers(): Returns a map of each PTransform that has pending timers to those timers. |
| WatermarkManager.TransformWatermarks | getWatermarks(ExecutableT executable): Gets the input and output watermarks for an AppliedPTransform. |
| void | initialize(java.util.Map<ExecutableT,? extends java.lang.Iterable<Bundle<?,CollectionT>>> initialBundles) |
| void | refreshAll(): Refresh the watermarks contained within this WatermarkManager, causing all watermarks to be advanced as far as possible. |
| void | updateWatermarks(Bundle<?,? extends CollectionT> completed, WatermarkManager.TimerUpdate timerUpdate, ExecutableT executable, Bundle<?,? extends CollectionT> unprocessedInputs, java.lang.Iterable<? extends Bundle<?,? extends CollectionT>> outputs, Instant earliestHold): Updates the watermarks of an executable with one or more inputs. |
public static <ExecutableT,CollectionT> WatermarkManager<ExecutableT,? super CollectionT> create(Clock clock, ExecutableGraph<ExecutableT,? super CollectionT> graph, java.util.function.Function<ExecutableT,java.lang.String> getName)
Creates a new WatermarkManager. All watermarks within the newly created WatermarkManager start at BoundedWindow.TIMESTAMP_MIN_VALUE, the minimum watermark, with no watermark holds or pending elements.
Parameters:
clock - the clock to use to determine processing time
graph - the graph representing this pipeline
getName - a function for producing a short identifier for the executable in watermark tracing log messages

public WatermarkManager.TransformWatermarks getWatermarks(ExecutableT executable)
Gets the input and output watermarks for an AppliedPTransform. If the PTransform has not processed any elements, returns a watermark of BoundedWindow.TIMESTAMP_MIN_VALUE.

public void initialize(java.util.Map<ExecutableT,? extends java.lang.Iterable<Bundle<?,CollectionT>>> initialBundles)

public void updateWatermarks(@Nullable Bundle<?,? extends CollectionT> completed, WatermarkManager.TimerUpdate timerUpdate, ExecutableT executable, @Nullable Bundle<?,? extends CollectionT> unprocessedInputs, java.lang.Iterable<? extends Bundle<?,? extends CollectionT>> outputs, Instant earliestHold)
Updates the watermarks of an executable with one or more inputs.
Each executable has two monotonically increasing watermarks: the input watermark, which can, at any time, be updated to equal:
    MAX(CurrentInputWatermark, MIN(PendingElements, InputPCollectionWatermarks))
and the output watermark, which can, at any time, be updated to equal:
    MAX(CurrentOutputWatermark, MIN(InputWatermark, WatermarkHolds))
Updates to watermarks may not be immediately visible.
Parameters:
completed - the input that has completed
timerUpdate - the timers that were added, removed, and completed as part of producing this update
executable - the executable applied to completed to produce the outputs
unprocessedInputs - inputs that could not be processed
outputs - outputs that were produced by the application of the executable to the input
earliestHold - the earliest watermark hold in the executable's state

public void refreshAll()
Refresh the watermarks contained within this WatermarkManager, causing all watermarks to be advanced as far as possible.

public java.util.Collection<WatermarkManager.FiredTimers<ExecutableT>> extractFiredTimers()
Returns a map of each PTransform that has pending timers to those timers. All of the pending timers will be removed from this WatermarkManager.
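
The extract-and-remove behaviour described for extractFiredTimers() means a second call made immediately afterwards has nothing left to return. The sketch below illustrates only that draining pattern. `FiredTimersSketch` and its methods are hypothetical, transforms are identified by `String` names, and timers are reduced to `long` timestamps; none of this reflects the actual types or internals of WatermarkManager.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of draining fired timers; not part of Beam. */
final class FiredTimersSketch {
  // Transform name -> timestamps of timers that have fired but not yet been extracted.
  private final Map<String, List<Long>> firedByTransform = new HashMap<>();

  synchronized void timerFired(String transform, long timestamp) {
    firedByTransform.computeIfAbsent(transform, k -> new ArrayList<>()).add(timestamp);
  }

  /** Returns all pending fired timers and removes them, so each timer is handed out once. */
  synchronized Map<String, List<Long>> extractFiredTimers() {
    Map<String, List<Long>> extracted = new HashMap<>(firedByTransform);
    firedByTransform.clear();
    return extracted;
  }
}
```

After two timers fire for the same transform, the first extraction returns both and the second returns an empty map.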