@Experimental(value=SPLITTABLE_DO_FN) public abstract class RestrictionTracker<RestrictionT,PositionT> extends java.lang.Object
RestrictionTrackers should implement RestrictionTracker.HasProgress; otherwise, auto-scaling
of workers and/or splitting may be poor. The same can happen if the reported progress is an
inaccurate representation of the known amount of completed and remaining work.
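As a rough illustration of progress expressed as completed and remaining work, the sketch below computes both quantities for an offset range. ProgressSketch and its methods are hypothetical names invented for this example and are not part of the Beam API.

```java
/** Hypothetical value holder for known completed and remaining work (not a Beam class). */
final class ProgressSketch {
  final double workCompleted;
  final double workRemaining;

  ProgressSketch(double workCompleted, double workRemaining) {
    this.workCompleted = workCompleted;
    this.workRemaining = workRemaining;
  }

  /** Progress for the range [from, to) when all offsets below nextUnclaimed are done. */
  static ProgressSketch forOffsetRange(long from, long to, long nextUnclaimed) {
    return new ProgressSketch(nextUnclaimed - from, to - nextUnclaimed);
  }

  /** Fraction of the known work that is complete; an empty range counts as done. */
  double fractionCompleted() {
    double total = workCompleted + workRemaining;
    return total == 0 ? 1.0 : workCompleted / total;
  }

  public static void main(String[] args) {
    ProgressSketch p = forOffsetRange(100, 200, 130);
    System.out.println(p.workCompleted + " done, " + p.workRemaining + " remaining");
  }
}
```

A runner could plausibly use numbers like these to decide whether adding workers or splitting is worthwhile, which is why an inaccurate report can hurt.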
|Modifier and Type||Class and Description|
A representation for the amount of known completed and remaining work.
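|Modifier and Type||Method and Description|
|abstract void||checkDone(): Called by the runner after DoFn.ProcessElement returns.|
|abstract RestrictionT||currentRestriction(): Returns a restriction accurately describing the full range of work the current DoFn.ProcessElement call will do, including already completed work.|
|abstract boolean||tryClaim(PositionT position): Attempts to claim the block of work in the current restriction identified by the given position.|
|abstract SplitResult<RestrictionT>||trySplit(double fractionOfRemainder): Splits current restriction based on fractionOfRemainder.|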
public abstract boolean tryClaim(PositionT position)
Attempts to claim the block of work in the current restriction identified by the given position.
If this succeeds, the DoFn MUST execute the entire block of work. If this fails,
DoFn.ProcessElement MUST return DoFn.ProcessContinuation#stop without performing any additional work or emitting output.
Note that emitting output or performing work from DoFn.ProcessElement is also not allowed before the first call to this method.
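The claim-then-work discipline described above can be sketched with a toy tracker over a half-open offset range. ClaimSketchTracker and ClaimLoopSketch are hypothetical stand-ins written for this example; they only mimic the contract and are not Beam classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tracker over the half-open offset range [from, to). */
final class ClaimSketchTracker {
  final long from;
  final long to;

  ClaimSketchTracker(long from, long to) {
    this.from = from;
    this.to = to;
  }

  /** Succeeds only while the offset lies inside the restriction. */
  boolean tryClaim(long offset) {
    return offset < to;
  }
}

final class ClaimLoopSketch {
  /** Mimics a process method: claim first, then do the work and emit. */
  static List<Long> process(ClaimSketchTracker tracker) {
    List<Long> emitted = new ArrayList<>();
    for (long offset = tracker.from; ; offset++) {
      if (!tracker.tryClaim(offset)) {
        // Claim failed: stop right away, with no further work or output.
        return emitted;
      }
      // Claim succeeded: the whole block of work for this offset is executed.
      emitted.add(offset);
    }
  }

  public static void main(String[] args) {
    System.out.println(process(new ClaimSketchTracker(100, 105)));
  }
}
```

Nothing is emitted before the first successful claim, and the loop returns on the first failed claim instead of finishing the current iteration.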
public abstract RestrictionT currentRestriction()
Returns a restriction accurately describing the full range of work the current DoFn.ProcessElement call will do, including already completed work.
@Nullable public abstract SplitResult<RestrictionT> trySplit(double fractionOfRemainder)
Splits current restriction based on fractionOfRemainder.
If splitting the current restriction is possible, the current restriction is split into a
primary and residual restriction pair. This invocation updates the
currentRestriction() to be the primary restriction, effectively making the current
DoFn.ProcessElement execution responsible for performing the work that the primary restriction
represents. The residual restriction will be executed in a separate DoFn.ProcessElement
invocation (likely in a different process). The work performed by executing the primary and
residual restrictions as separate DoFn.ProcessElement invocations MUST be equivalent to
the work performed as if this split never occurred.
fractionOfRemainder should be used in a best effort manner to choose a primary
and residual restriction based upon the fraction of the remaining work that the current
DoFn.ProcessElement invocation is responsible for. For example, if a
DoFn.ProcessElement was reading a file with a restriction representing the offset range
[100, 200) and has processed up to offset 130 with a fractionOfRemainder of
0.7, the primary and residual restrictions returned would be
[100, 179), [179, 200)
(note: currentOffset + fractionOfRemainder * remainingWork = 130 + 0.7 * 70 = 179).
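The arithmetic in this example can be reproduced with a small helper. SplitPointSketch and splitPosition are hypothetical names for this example, and rounding with Math.round is an assumption of the sketch; a real tracker may round differently.

```java
final class SplitPointSketch {
  /**
   * Offset dividing primary from residual:
   * currentOffset + fractionOfRemainder * remainingWork.
   * Rounding to the nearest offset is an assumption of this sketch.
   */
  static long splitPosition(long currentOffset, long end, double fractionOfRemainder) {
    long remainingWork = end - currentOffset;
    return currentOffset + Math.round(fractionOfRemainder * remainingWork);
  }

  public static void main(String[] args) {
    long split = splitPosition(130, 200, 0.7);
    // Prints: [100, 179), [179, 200)
    System.out.println("[100, " + split + "), [" + split + ", 200)");
  }
}
```

With a fraction of 0 the helper returns the current offset itself, so the primary ends where processing currently stands and all remaining work goes to the residual.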
fractionOfRemainder = 0 means a checkpoint is required.
Implementing this API is recommended for batch pipelines, where it improves parallel processing performance.
Implementing this API is required for streaming pipelines.
Parameters:
fractionOfRemainder - A hint as to the fraction of work the primary restriction should represent based upon the current known remaining amount of work.
Returns:
a SplitResult if a split was possible, otherwise null. If fractionOfRemainder == 0, a null result MUST imply that the restriction tracker is done and there is no more work left to do.
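One way to picture how the primary, the residual, and the null result fit together is the toy tracker below. SplitSketchTracker is a hypothetical class for this example; it returns a plain long[] in place of SplitResult and is not how any Beam tracker is actually implemented.

```java
/** Hypothetical tracker over [from, to) that supports a simplified trySplit. */
final class SplitSketchTracker {
  private final long from;
  private long to;
  private long lastClaimed;

  SplitSketchTracker(long from, long to) {
    this.from = from;
    this.to = to;
    this.lastClaimed = from - 1; // nothing claimed yet
  }

  boolean tryClaim(long offset) {
    if (offset >= to) {
      return false;
    }
    lastClaimed = offset;
    return true;
  }

  /**
   * Returns {primaryFrom, primaryTo, residualFrom, residualTo},
   * or null when no work would be left for a residual.
   */
  long[] trySplit(double fractionOfRemainder) {
    long nextUnclaimed = lastClaimed + 1;
    long remaining = to - nextUnclaimed;
    long splitAt = nextUnclaimed + Math.round(fractionOfRemainder * remaining);
    if (splitAt >= to) {
      return null; // with a fraction of 0 this only happens when the tracker is done
    }
    long[] result = {from, splitAt, splitAt, to};
    to = splitAt; // the current restriction becomes the primary
    return result;
  }

  public static void main(String[] args) {
    SplitSketchTracker tracker = new SplitSketchTracker(100, 200);
    for (long offset = 100; offset < 130; offset++) {
      tracker.tryClaim(offset);
    }
    System.out.println(java.util.Arrays.toString(tracker.trySplit(0.7)));
  }
}
```

After a successful split, claims at or beyond the split offset fail on this tracker, because that range now belongs to the residual.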
public abstract void checkDone() throws java.lang.IllegalStateException
Must throw an exception with an informative error message if any unclaimed work remains in the restriction.
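A minimal sketch of such a check, assuming a toy tracker that claims offsets in increasing order. DoneCheckSketchTracker is a hypothetical class for this example, and tracking only the last successfully claimed offset is a simplification of this sketch.

```java
/** Hypothetical tracker over [from, to) with a checkDone that reports leftover work. */
final class DoneCheckSketchTracker {
  private final long from;
  private final long to;
  private long lastClaimed;

  DoneCheckSketchTracker(long from, long to) {
    this.from = from;
    this.to = to;
    this.lastClaimed = from - 1; // nothing claimed yet
  }

  boolean tryClaim(long offset) {
    if (offset >= to) {
      return false;
    }
    lastClaimed = offset;
    return true;
  }

  /** Throws if any offset in the restriction has not been claimed. */
  void checkDone() {
    long nextUnclaimed = lastClaimed + 1;
    if (nextUnclaimed < to) {
      throw new IllegalStateException(
          "Restriction [" + from + ", " + to + ") still has unclaimed work in ["
              + nextUnclaimed + ", " + to + ")");
    }
  }

  public static void main(String[] args) {
    DoneCheckSketchTracker tracker = new DoneCheckSketchTracker(100, 103);
    tracker.tryClaim(100);
    try {
      tracker.checkDone();
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Naming both the full restriction and the unclaimed range in the message makes it easier to see which part of the work was skipped.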