Class DetectNewPartitionsTracker

All Implemented Interfaces:
RestrictionTracker.HasProgress

@Internal public class DetectNewPartitionsTracker extends GrowableOffsetRangeTracker
  • Constructor Details

    • DetectNewPartitionsTracker

      public DetectNewPartitionsTracker(long start)
  • Method Details

    • trySplit

      public @Nullable SplitResult<OffsetRange> trySplit(double fractionOfRemainder)
      Description copied from class: RestrictionTracker
      Splits current restriction based on fractionOfRemainder.

      If the current restriction can be split, it is divided into a primary and a residual restriction. This invocation updates RestrictionTracker.currentRestriction() to the primary restriction, so the current DoFn.ProcessElement execution remains responsible for the work that the primary restriction represents. The residual restriction will be executed in a separate DoFn.ProcessElement invocation (likely in a different process). The work performed by executing the primary and residual restrictions as separate DoFn.ProcessElement invocations MUST be equivalent to the work that would have been performed had the split never occurred.

      The fractionOfRemainder should be used on a best-effort basis to choose a primary and residual restriction based upon the fraction of the remaining work that the current DoFn.ProcessElement invocation is responsible for. For example, if a DoFn.ProcessElement is reading a file with a restriction representing the offset range [100, 200), has processed up to offset 130, and receives a fractionOfRemainder of 0.7, the primary and residual restrictions returned would be [100, 179) and [179, 200) (note: currentOffset + fractionOfRemainder * remainingWork = 130 + 0.7 * 70 = 179).
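
      The arithmetic in the example above can be checked with a short, self-contained sketch. It does not use any Beam classes: SplitPointSketch and splitPosition are hypothetical names introduced only for illustration, and the code follows the documented formula rather than the actual tracker implementation. BigDecimal is used so that 0.7 * 70 is computed without binary floating-point rounding.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class SplitPointSketch {

    /**
     * Hypothetical helper: applies the documented formula
     * currentOffset + fractionOfRemainder * remainingWork,
     * where remainingWork = to - currentOffset, and rounds down to a whole offset.
     */
    public static long splitPosition(long currentOffset, long to, double fractionOfRemainder) {
        BigDecimal remainingWork = BigDecimal.valueOf(to - currentOffset);
        BigDecimal advance = remainingWork.multiply(BigDecimal.valueOf(fractionOfRemainder));
        return BigDecimal.valueOf(currentOffset)
                .add(advance)
                .setScale(0, RoundingMode.FLOOR)
                .longValueExact();
    }

    public static void main(String[] args) {
        long from = 100L;
        long to = 200L;
        // Values taken from the example in the text: offset 130, fraction 0.7.
        long split = splitPosition(130L, to, 0.7);
        System.out.println("primary  = [" + from + ", " + split + ")");
        System.out.println("residual = [" + split + ", " + to + ")");
    }
}
```

      With the values from the text, the split position is 179, which gives the primary restriction [100, 179) and the residual restriction [179, 200).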

      fractionOfRemainder = 0 means a checkpoint is required.

      The API is recommended to be implemented for a batch pipeline, since it improves parallel processing performance and is very important for pipeline scaling and end-to-end pipeline execution.

      The API is required to be implemented for a streaming pipeline.

      Overrides:
      trySplit in class GrowableOffsetRangeTracker
      Parameters:
      fractionOfRemainder - A hint as to the fraction of work the primary restriction should represent based upon the current known remaining amount of work.
      Returns:
      A SplitResult if a split was possible, otherwise null. If fractionOfRemainder == 0, a null result MUST imply that the restriction tracker is done and there is no more work left to do.
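
      The return contract for fractionOfRemainder == 0 can be illustrated with a toy tracker. This is a minimal sketch under stated assumptions, not Beam's implementation: CheckpointSketch and ToyTracker are hypothetical, the split result is modelled as a plain long[] instead of SplitResult<OffsetRange>, and claiming an offset is assumed to mean that offset has been processed, so the first unprocessed offset is the one after it.

```java
public class CheckpointSketch {

    /** Hypothetical toy tracker over the offset range [from, to). */
    public static final class ToyTracker {
        private final long from;
        private long to;
        private long next; // first offset that has not been processed yet

        public ToyTracker(long from, long to) {
            this.from = from;
            this.to = to;
            this.next = from;
        }

        /** Marks every offset up to and including the given offset as processed. */
        public boolean tryClaim(long offset) {
            if (offset < next || offset >= to) {
                return false;
            }
            next = offset + 1;
            return true;
        }

        /**
         * Returns {primaryFrom, primaryTo, residualFrom, residualTo},
         * or null when no work would be left for a residual.
         */
        public long[] trySplit(double fractionOfRemainder) {
            long remaining = to - next;
            long split = next + (long) Math.floor(fractionOfRemainder * remaining);
            if (split >= to) {
                return null; // nothing left to hand off
            }
            long oldTo = to;
            to = split; // the tracker now owns only the primary restriction
            return new long[] {from, split, split, oldTo};
        }
    }

    public static void main(String[] args) {
        ToyTracker tracker = new ToyTracker(100L, 200L);
        tracker.tryClaim(130L);

        // A checkpoint: the primary keeps what was processed, the residual gets the rest.
        long[] checkpoint = tracker.trySplit(0.0);
        System.out.println(java.util.Arrays.toString(checkpoint));

        // A second checkpoint finds no remaining work, so the result is null.
        System.out.println(tracker.trySplit(0.0) == null);
    }
}
```

      In this toy model the first call with a fraction of 0 hands off everything that has not been processed, and the second call returns null because the tracker has no work left, which is the situation the contract above describes.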