apache_beam.io.watermark_estimators module
A collection of WatermarkEstimator implementations that SplittableDoFns can use.
- class apache_beam.io.watermark_estimators.MonotonicWatermarkEstimator(timestamp)[source]
Bases:
WatermarkEstimator
A WatermarkEstimator which assumes that timestamps of all ouput records are increasing monotonically.
For a new <element, restriction> pair, the initial value is None. When resuming processing, the initial timestamp will be the last reported watermark.
- class apache_beam.io.watermark_estimators.WalltimeWatermarkEstimator(timestamp=None)[source]
Bases:
WatermarkEstimator
A WatermarkEstimator which uses processing time as the estimated watermark.
- class apache_beam.io.watermark_estimators.ManualWatermarkEstimator(watermark)[source]
Bases:
WatermarkEstimator
A WatermarkEstimator which is controlled manually from within a DoFn.
The DoFn must invoke set_watermark to advance the watermark.
- set_watermark(timestamp)[source]
Sets a timestamp before or at the timestamps of all future elements produced by the associated DoFn.
This can be approximate. If records are output that violate this guarantee, they will be considered late, which will affect how they will be processed. See https://beam.apache.org/documentation/programming-guide/#watermarks-and-late-data for more information on late data and how to handle it.
However, this value should be as late as possible. Downstream windows may not be able to close until this watermark passes their end.