@DoFn.UnboundedPerElement public class DetectNewPartitionsDoFn extends DoFn<byte[],PartitionMetadata>
PartitionMetadata.State.CREATED
, update their state to PartitionMetadata.State.SCHEDULED
and output them to the next
stage in the pipeline.DoFn.AlwaysFetched, DoFn.BoundedPerElement, DoFn.BundleFinalizer, DoFn.Element, DoFn.FieldAccess, DoFn.FinishBundle, DoFn.FinishBundleContext, DoFn.GetInitialRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.GetRestrictionCoder, DoFn.GetSize, DoFn.GetWatermarkEstimatorStateCoder, DoFn.Key, DoFn.MultiOutputReceiver, DoFn.NewTracker, DoFn.NewWatermarkEstimator, DoFn.OnTimer, DoFn.OnTimerContext, DoFn.OnTimerFamily, DoFn.OnWindowExpiration, DoFn.OnWindowExpirationContext, DoFn.OutputReceiver<T>, DoFn.ProcessContext, DoFn.ProcessContinuation, DoFn.ProcessElement, DoFn.RequiresStableInput, DoFn.RequiresTimeSortedInput, DoFn.Restriction, DoFn.Setup, DoFn.SideInput, DoFn.SplitRestriction, DoFn.StartBundle, DoFn.StartBundleContext, DoFn.StateId, DoFn.Teardown, DoFn.TimerFamily, DoFn.TimerId, DoFn.Timestamp, DoFn.TruncateRestriction, DoFn.UnboundedPerElement, DoFn.WatermarkEstimatorState, DoFn.WindowedContext
Constructor and Description |
---|
DetectNewPartitionsDoFn(DaoFactory daoFactory,
MapperFactory mapperFactory,
ChangeStreamMetrics metrics)
This class needs a
DaoFactory to build DAOs to access the partition metadata tables. |
DetectNewPartitionsDoFn(DaoFactory daoFactory,
MapperFactory mapperFactory,
ChangeStreamMetrics metrics,
Duration resumeDuration)
This class needs a
DaoFactory to build DAOs to access the partition metadata tables. |
Modifier and Type | Method and Description |
---|---|
Instant |
getInitialWatermarkEstimatorState(Instant currentElementTimestamp) |
OffsetRange |
initialRestriction()
Uses an
OffsetRange with a max range. |
ManualWatermarkEstimator<Instant> |
newWatermarkEstimator(Instant watermarkEstimatorState) |
DoFn.ProcessContinuation |
processElement(RestrictionTracker<OffsetRange,java.lang.Long> tracker,
DoFn.OutputReceiver<PartitionMetadata> receiver,
ManualWatermarkEstimator<Instant> watermarkEstimator)
Main processing function for the
DetectNewPartitionsDoFn function. |
OffsetRangeTracker |
restrictionTracker(OffsetRange restriction) |
void |
setup()
Obtains the instances of
PartitionMetadataDao and PartitionMetadataMapper from
their respective factories. |
getAllowedTimestampSkew, getInputTypeDescriptor, getOutputTypeDescriptor, populateDisplayData, prepareForProcessing
public DetectNewPartitionsDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ChangeStreamMetrics metrics)
DaoFactory
to build DAOs to access the partition metadata tables. It
uses mappers to transform database rows into the PartitionMetadata
model. It emits
metrics for the partitions read using the ChangeStreamMetrics
. This constructors sets
the the periodic re-execution of the component to be scheduled using the DEFAULT_RESUME_DURATION
duration (best effort).daoFactory
- the DaoFactory
to construct PartitionMetadataDao
smapperFactory
- the MapperFactory
to construct PartitionMetadataMapper
smetrics
- the ChangeStreamMetrics
to emit partition related metricspublic DetectNewPartitionsDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ChangeStreamMetrics metrics, Duration resumeDuration)
DaoFactory
to build DAOs to access the partition metadata tables. It
uses mappers to transform database rows into the PartitionMetadata
model. It emits
metrics for the partitions read using the ChangeStreamMetrics
. It re-schedules the
process element function to be executed according to the specified duration (best effort).daoFactory
- the DaoFactory
to construct PartitionMetadataDao
smapperFactory
- the MapperFactory
to construct PartitionMetadataMapper
smetrics
- the ChangeStreamMetrics
to emit partition related metricsresumeDuration
- specifies the periodic schedule to re-execute this component@DoFn.GetInitialWatermarkEstimatorState public Instant getInitialWatermarkEstimatorState(@DoFn.Timestamp Instant currentElementTimestamp)
@DoFn.NewWatermarkEstimator public ManualWatermarkEstimator<Instant> newWatermarkEstimator(@DoFn.WatermarkEstimatorState Instant watermarkEstimatorState)
@DoFn.GetInitialRestriction public OffsetRange initialRestriction()
OffsetRange
with a max range. This is because it does not know before hand how
many partitions it will schedule.
In order to circumvent a bug in Apache Beam
(https://issues.apache.org/jira/browse/BEAM-12756) we don't use Long.MAX_VALUE
, but
rather a value slightly smaller.
@DoFn.NewTracker public OffsetRangeTracker restrictionTracker(@DoFn.Restriction OffsetRange restriction)
@DoFn.Setup public void setup()
PartitionMetadataDao
and PartitionMetadataMapper
from
their respective factories.@DoFn.ProcessElement public DoFn.ProcessContinuation processElement(RestrictionTracker<OffsetRange,java.lang.Long> tracker, DoFn.OutputReceiver<PartitionMetadata> receiver, ManualWatermarkEstimator<Instant> watermarkEstimator)
DetectNewPartitionsDoFn
function. It follows this
procedure periodically:
PartitionMetadata.State.CREATED
.
PartitionMetadata.State.SCHEDULED
.
tracker
- an instance of OffsetRangeTracker
receiver
- a PartitionMetadata
OutputReceiver
watermarkEstimator
- a ManualWatermarkEstimator
of Instant
ProcessContinuation#stop()
if there are no more partitions to process or
ProcessContinuation#resume()
to re-schedule the function after the configured
interval.