Class ReadChangeStreamPartitionDoFn
java.lang.Object
org.apache.beam.sdk.transforms.DoFn<PartitionMetadata,DataChangeRecord>
 
org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn
- All Implemented Interfaces:
- Serializable, HasDisplayData
@UnboundedPerElement
public class ReadChangeStreamPartitionDoFn
extends DoFn<PartitionMetadata,DataChangeRecord>
implements Serializable 
An SDF (Splittable DoFn) class which is responsible for performing a change stream query for a
 given partition. A different action will be taken depending on the type of record received from
 the query. This component will also reflect the partition state in the partition metadata tables.
 
The processing of a partition is delegated to the QueryChangeStreamAction.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.beam.sdk.transforms.DoFn
DoFn.AlwaysFetched, DoFn.BoundedPerElement, DoFn.BundleFinalizer, DoFn.Element, DoFn.FieldAccess, DoFn.FinishBundle, DoFn.FinishBundleContext, DoFn.GetInitialRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.GetRestrictionCoder, DoFn.GetSize, DoFn.GetWatermarkEstimatorStateCoder, DoFn.Key, DoFn.MultiOutputReceiver, DoFn.NewTracker, DoFn.NewWatermarkEstimator, DoFn.OnTimer, DoFn.OnTimerContext, DoFn.OnTimerFamily, DoFn.OnWindowExpiration, DoFn.OnWindowExpirationContext, DoFn.OutputReceiver<T>, DoFn.ProcessContext, DoFn.ProcessContinuation, DoFn.ProcessElement, DoFn.RequiresStableInput, DoFn.RequiresTimeSortedInput, DoFn.Restriction, DoFn.Setup, DoFn.SideInput, DoFn.SplitRestriction, DoFn.StartBundle, DoFn.StartBundleContext, DoFn.StateId, DoFn.Teardown, DoFn.TimerFamily, DoFn.TimerId, DoFn.Timestamp, DoFn.TruncateRestriction, DoFn.UnboundedPerElement, DoFn.WatermarkEstimatorState, DoFn.WindowedContext
- 
Constructor Summary
Constructors
- ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics): This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query.
- 
Method Summary
- double getSize(PartitionMetadata partition, TimestampRange range)
- TimestampRange initialRestriction(PartitionMetadata partition): The restriction for a partition will be defined from the start and end timestamp to query the partition for.
- ReadChangeStreamPartitionRangeTracker newTracker(PartitionMetadata partition, TimestampRange range)
- ManualWatermarkEstimator<Instant> newWatermarkEstimator(Instant watermarkEstimatorState)
- DoFn.ProcessContinuation processElement(PartitionMetadata partition, RestrictionTracker<TimestampRange, com.google.cloud.Timestamp> tracker, DoFn.OutputReceiver<DataChangeRecord> receiver, ManualWatermarkEstimator<Instant> watermarkEstimator, DoFn.BundleFinalizer bundleFinalizer): Performs a change stream query for a given partition.
- void setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator): Sets the estimator to calculate the backlog of this function.
- void setup(): Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction, PartitionStartRecordAction, PartitionEndRecordAction, PartitionEventRecordAction and QueryChangeStreamAction.
Methods inherited from class org.apache.beam.sdk.transforms.DoFn
getAllowedTimestampSkew, getInputTypeDescriptor, getOutputTypeDescriptor, populateDisplayData, prepareForProcessing
- 
Constructor Details
ReadChangeStreamPartitionDoFn
public ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics)
This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query. It uses mappers to transform database rows into the ChangeStreamRecord model. It uses the ActionFactory to construct the action dispatchers, which will perform the change stream query and process each type of record received. It emits metrics for the partition using the ChangeStreamMetrics.
- Parameters:
- daoFactory - the DaoFactory to construct PartitionMetadataDaos and ChangeStreamDaos
- mapperFactory - the MapperFactory to construct ChangeStreamRecordMappers
- actionFactory - the ActionFactory to construct actions
- metrics - the ChangeStreamMetrics to emit partition-related metrics
 
 
- 
- 
Method Details
getInitialWatermarkEstimatorState
@GetInitialWatermarkEstimatorState public Instant getInitialWatermarkEstimatorState(@Element PartitionMetadata partition)
- 
newWatermarkEstimator
@NewWatermarkEstimator public ManualWatermarkEstimator<Instant> newWatermarkEstimator(@WatermarkEstimatorState Instant watermarkEstimatorState)
- 
initialRestriction
@GetInitialRestriction public TimestampRange initialRestriction(@Element PartitionMetadata partition)
The restriction for a partition will be defined from the start and end timestamp to query the partition for. The TimestampRange restriction represents a closed-open interval, while the start / end timestamps represent a closed-closed interval, so we add 1 nanosecond to the end timestamp to convert it to closed-open.
In this function we also update the partition state to PartitionMetadata.State.RUNNING.
- Parameters:
- partition - the partition to be queried
- Returns:
- the timestamp range from the partition start timestamp to the partition end timestamp + 1 nanosecond
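The closed-closed to closed-open conversion described for initialRestriction can be illustrated with a small sketch. It is not the connector's code: java.time.Instant stands in for com.google.cloud.Timestamp, and the class and method names (ClosedOpenRangeSketch, toExclusiveEnd, contains) are hypothetical.

```java
import java.time.Instant;

// Hypothetical helper, not part of the Beam API. It only illustrates the
// interval arithmetic; java.time.Instant stands in for com.google.cloud.Timestamp.
final class ClosedOpenRangeSketch {

    /** Converts an inclusive end timestamp into the exclusive end of a closed-open range. */
    static Instant toExclusiveEnd(Instant inclusiveEnd) {
        return inclusiveEnd.plusNanos(1);
    }

    /** Returns true if ts lies in the closed-open range [start, exclusiveEnd). */
    static boolean contains(Instant start, Instant exclusiveEnd, Instant ts) {
        return !ts.isBefore(start) && ts.isBefore(exclusiveEnd);
    }
}
```

With this conversion the original inclusive end timestamp still falls inside the range, while the timestamp one nanosecond later does not.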
 
- 
getSize
@GetSize public double getSize(@Element PartitionMetadata partition, @Restriction TimestampRange range) throws Exception
- Throws:
- Exception
 
- 
newTracker
@NewTracker public ReadChangeStreamPartitionRangeTracker newTracker(@Element PartitionMetadata partition, @Restriction TimestampRange range)
- 
setup
Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction, PartitionStartRecordAction, PartitionEndRecordAction, PartitionEventRecordAction and QueryChangeStreamAction.
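The split between a constructor that receives factories and a setup method that builds the working instances can be sketched as below. This is only an illustration of the general pattern, not the class's actual implementation: every name in it (LazySetupSketch, Dao, DaoBuilder) is hypothetical, and it assumes the motivation is that the DoFn itself is Serializable while the objects it builds need not be.

```java
import java.io.Serializable;

// Hypothetical illustration of "factories in the constructor, instances in setup".
// None of these names come from the Beam Spanner connector.
final class LazySetupSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Stand-in for a DAO that is assumed not to be serializable. */
    static final class Dao {
        String describe() {
            return "dao";
        }
    }

    /** Stand-in for a serializable factory such as the ones passed to the constructor. */
    interface DaoBuilder extends Serializable {
        Dao build();
    }

    private final DaoBuilder builder; // serialized together with the function
    private transient Dao dao;        // built on the worker, never serialized

    LazySetupSketch(DaoBuilder builder) {
        this.builder = builder;
    }

    /** Plays the role of the setup method: builds the working instances. */
    void setup() {
        dao = builder.build();
    }

    /** Returns true once setup has built the working instance. */
    boolean isReady() {
        return dao != null;
    }
}
```

In this sketch nothing usable exists until setup runs, which mirrors the requirement that the instances listed above are constructed before any element is processed.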
- 
processElement
@ProcessElement public DoFn.ProcessContinuation processElement(@Element PartitionMetadata partition, RestrictionTracker<TimestampRange, com.google.cloud.Timestamp> tracker, DoFn.OutputReceiver<DataChangeRecord> receiver, ManualWatermarkEstimator<Instant> watermarkEstimator, DoFn.BundleFinalizer bundleFinalizer)
Performs a change stream query for a given partition. A different action will be taken depending on the type of record received from the query. This component will also reflect the partition state in the partition metadata tables.
The processing of a partition is delegated to the QueryChangeStreamAction.
- Parameters:
- partition - the partition to be queried
- tracker - an instance of ReadChangeStreamPartitionRangeTracker
- receiver - a DataChangeRecord DoFn.OutputReceiver
- watermarkEstimator - a ManualWatermarkEstimator of Instant
- bundleFinalizer - the bundle finalizer
- Returns:
- DoFn.ProcessContinuation.stop() if a record timestamp could not be claimed or if the partition processing has finished
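The per-record dispatch described for processElement can be sketched in plain Java. This is a simplified illustration, not the QueryChangeStreamAction implementation: the types (Kind, Rec), the use of long timestamps, and the choice to advance the watermark after every claimed record are assumptions made for the example.

```java
import java.util.List;

// Hypothetical stand-ins; the real record, tracker and action types live in
// the Beam Spanner change streams connector.
final class PartitionQuerySketch {
    enum Kind { DATA_CHANGE, HEARTBEAT, CHILD_PARTITIONS }

    static final class Rec {
        final Kind kind;
        final long timestamp;

        Rec(Kind kind, long timestamp) {
            this.kind = kind;
            this.timestamp = timestamp;
        }
    }

    /**
     * Processes records in order until one falls outside the closed-open
     * restriction ending at endExclusive. Data change timestamps are appended
     * to emitted. Returns the last claimed timestamp (the sketch's watermark),
     * or Long.MIN_VALUE if nothing could be claimed.
     */
    static long process(List<Rec> records, long endExclusive, List<Long> emitted) {
        long watermark = Long.MIN_VALUE;
        for (Rec r : records) {
            // Assumed claim rule: a timestamp at or past the end cannot be claimed,
            // so processing stops.
            if (r.timestamp >= endExclusive) {
                return watermark;
            }
            switch (r.kind) {
                case DATA_CHANGE:
                    emitted.add(r.timestamp); // only data changes are output
                    break;
                case HEARTBEAT:
                    break; // nothing to emit; only progress is recorded
                case CHILD_PARTITIONS:
                    break; // a real action would register the child partitions
            }
            watermark = r.timestamp;
        }
        return watermark; // all records consumed: partition finished
    }
}
```

Both exit paths correspond to the stop() result described above: either a timestamp could not be claimed, or the records for the partition were exhausted.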
 
- 
setThroughputEstimator
Sets the estimator to calculate the backlog of this function. Must be called after the initialization of this DoFn.
- Parameters:
- throughputEstimator - an estimator to calculate local throughput.
 
 
-