Package org.apache.beam.sdk.io.kafka
Class CustomTimestampPolicyWithLimitedDelay<K,V>
java.lang.Object
org.apache.beam.sdk.io.kafka.TimestampPolicy<K,V>
org.apache.beam.sdk.io.kafka.CustomTimestampPolicyWithLimitedDelay<K,V>
A policy for custom record timestamps where timestamps within a partition are expected to be
roughly monotonically increasing with a cap on out of order event delays (say 1 minute). The
watermark at any time is '(
Min(now(), Max(event timestamp so far)) - max delay
)'.
However, watermark is never set in future and capped to 'now - max delay'. In addition, watermark
advanced to 'now - max delay' when a partition is idle.-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.beam.sdk.io.kafka.TimestampPolicy
TimestampPolicy.PartitionContext
-
Constructor Summary
ConstructorsConstructorDescriptionCustomTimestampPolicyWithLimitedDelay
(SerializableFunction<KafkaRecord<K, V>, Instant> timestampFunction, Duration maxDelay, Optional<Instant> previousWatermark) A policy for custom record timestamps where timestamps are expected to be roughly monotonically increasing with out of order event delays less thanmaxDelay
. -
Method Summary
Modifier and TypeMethodDescriptiongetTimestampForRecord
(TimestampPolicy.PartitionContext ctx, KafkaRecord<K, V> record) Returns record timestamp (aka event time).Returns watermark for the partition.
-
Constructor Details
-
CustomTimestampPolicyWithLimitedDelay
public CustomTimestampPolicyWithLimitedDelay(SerializableFunction<KafkaRecord<K, V>, Instant> timestampFunction, Duration maxDelay, Optional<Instant> previousWatermark) A policy for custom record timestamps where timestamps are expected to be roughly monotonically increasing with out of order event delays less thanmaxDelay
. The watermark at any time isMin(now(), max_event_timestamp) - maxDelay
.- Parameters:
timestampFunction
- A function to extract timestamp from the recordmaxDelay
- For any record in the Kafka partition, the timestamp of any subsequent record is expected to be aftercurrent record timestamp - maxDelay
.previousWatermark
- Latest check-pointed watermark, seeTimestampPolicyFactory.createTimestampPolicy(TopicPartition, Optional)
-
-
Method Details
-
getTimestampForRecord
Description copied from class:TimestampPolicy
Returns record timestamp (aka event time). This is often based on the timestamp of the Kafka record. This is invoked for each record when it is processed in the reader.- Specified by:
getTimestampForRecord
in classTimestampPolicy<K,
V>
-
getWatermark
Description copied from class:TimestampPolicy
Returns watermark for the partition. It is the timestamp before or at the timestamps of all future records consumed from the partition. SeeUnboundedSource.UnboundedReader.getWatermark()
for more guidance on watermarks. E.g. if the record timestamp is 'LogAppendTime', watermark would be the timestamp of the last record since 'LogAppendTime' monotonically increases within a partition.- Specified by:
getWatermark
in classTimestampPolicy<K,
V>
-