Package org.apache.beam.sdk.io.kafka
Class CustomTimestampPolicyWithLimitedDelay<K,V>
java.lang.Object
org.apache.beam.sdk.io.kafka.TimestampPolicy<K,V>
org.apache.beam.sdk.io.kafka.CustomTimestampPolicyWithLimitedDelay<K,V>
A policy for custom record timestamps where timestamps within a partition are expected to be
roughly monotonically increasing with a cap on out of order event delays (say 1 minute). The
watermark at any time is '(
Min(now(), Max(event timestamp so far)) - max delay)'.
However, watermark is never set in future and capped to 'now - max delay'. In addition, watermark
advanced to 'now - max delay' when a partition is idle.-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.beam.sdk.io.kafka.TimestampPolicy
TimestampPolicy.PartitionContext -
Constructor Summary
ConstructorsConstructorDescriptionCustomTimestampPolicyWithLimitedDelay(SerializableFunction<KafkaRecord<K, V>, Instant> timestampFunction, Duration maxDelay, Optional<Instant> previousWatermark) A policy for custom record timestamps where timestamps are expected to be roughly monotonically increasing with out of order event delays less thanmaxDelay. -
Method Summary
Modifier and TypeMethodDescriptiongetTimestampForRecord(TimestampPolicy.PartitionContext ctx, KafkaRecord<K, V> record) Returns record timestamp (aka event time).Returns watermark for the partition.
-
Constructor Details
-
CustomTimestampPolicyWithLimitedDelay
public CustomTimestampPolicyWithLimitedDelay(SerializableFunction<KafkaRecord<K, V>, Instant> timestampFunction, Duration maxDelay, Optional<Instant> previousWatermark) A policy for custom record timestamps where timestamps are expected to be roughly monotonically increasing with out of order event delays less thanmaxDelay. The watermark at any time isMin(now(), max_event_timestamp) - maxDelay.- Parameters:
timestampFunction- A function to extract timestamp from the recordmaxDelay- For any record in the Kafka partition, the timestamp of any subsequent record is expected to be aftercurrent record timestamp - maxDelay.previousWatermark- Latest check-pointed watermark, seeTimestampPolicyFactory.createTimestampPolicy(TopicPartition, Optional)
-
-
Method Details
-
getTimestampForRecord
Description copied from class:TimestampPolicyReturns record timestamp (aka event time). This is often based on the timestamp of the Kafka record. This is invoked for each record when it is processed in the reader.- Specified by:
getTimestampForRecordin classTimestampPolicy<K,V>
-
getWatermark
Description copied from class:TimestampPolicyReturns watermark for the partition. It is the timestamp before or at the timestamps of all future records consumed from the partition. SeeUnboundedSource.UnboundedReader.getWatermark()for more guidance on watermarks. E.g. if the record timestamp is 'LogAppendTime', watermark would be the timestamp of the last record since 'LogAppendTime' monotonically increases within a partition.- Specified by:
getWatermarkin classTimestampPolicy<K,V>
-